Words and characters

baumkuchen5000   November 24th, 2013 12:02p.m.

I am not quite sure I understand the statistics correctly.
I am guessing "words" means entries consisting of 2 or more characters. But if I learn a word which contains a character I haven't learned before, does that character also get added to my count of learned "characters"?
And is the true number of words I have learned the number of "words" + number of "characters" - number of "radicals"? How do I know how many radicals I have learned?

nick   December 1st, 2013 9:13p.m.

Sounds like you've got it: words are 2+ characters, and unknown characters within words are counted on your character totals (except for character definitions). Radicals are counted as characters if you study them directly. (They are not counted anywhere if you just study them within other characters).

antti   December 13th, 2013 12:11a.m.

Continuing on the same topic, how do you distinguish between definitions, readings, and writings?

In my current stats for characters, I have roughly 420 definitions and readings learned but about 1200 writings and tones learned. How come?

For words, it seems more consistent but there are still considerable discrepancies. All four indicators are between 1040 and 1260.

nick   December 15th, 2013 2:24p.m.

Are you studying on the iOS app? Component character readings and definitions aren't studied/counted when you study their containing words, whereas writings and tones are. So your higher writing/tone counts reflect those component characters.

The word counts can be different depending on whether you toggled your parts/styles, are studying both simplified and traditional, or have added but not learned some of your words.

antti   December 15th, 2013 9:14p.m.

Yes, I'm on the iOS app.

I'm doing only the traditional and haven't tinkered with the settings. Anyway, the part about character definitions and readings makes sense now. Thanks for the clarification!

Alan   December 16th, 2013 7:28a.m.

I wrote a small web script that can check your Skritter vocabulary and tell you how many 'words' (including 1 character) you know: http://hskhsk.pythonanywhere.com/hanzi

The concept of a 'word' is vague enough in Chinese already, without confusing the issue further by saying that a word has to have more than 1 character... ;)

szhen   December 16th, 2013 8:19a.m.

Hi Alan, how many Skritter words/characters can the script analyze? It comes back with "play fair - too many words" when I paste my export in, even though my Skritter vocab is quite small ...

Alan   December 16th, 2013 8:48a.m.

I put a 50,000 character limit in because I noticed a few people were using massive amounts of CPU time, uploading a corpus to the 'analyse a block of text' feature or something.

I've removed the limit for vocab lists, as it's likely you are pasting in your skritter vocab with all definitions (which pushes it over the limit) and the script is just ignoring all but the first column of Hanzi.

I also made the response a little more serious sounding; jokey error messages are always a bad idea (but you won't run into it now with vocab lists).

So maybe give it another try?

Edit: I've updated the script to be a bit clearer about how this count is performed; I was counting everything in the vocab list as a 'word', but that may not be true if you have included non-word characters in your vocab. I've intersected the vocab set that was input with the words in SUBTLEX-CH and CC-CEDICT, to give two different points of view on how many actual 'words' you know. So now the "Analysis of Words/Characters in Input" section looks something like:

Input contained:
147 unique single-character entries
582 unique multi-character entries <-- Skritter 'words'
729 unique entries
750 total entries
648 unique characters <-- Skritter 'characters'
8965 total characters
539 unique words as recognised by SUBTLEX-CH
548 unique words as recognised by CC-CEDICT
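
For readers curious how counts like these could be produced, here is a minimal Python sketch. It is not the actual script behind the page above: the function name `analyse_vocab`, the tab-separated input format, and the `known_words` set (standing in for a word list such as SUBTLEX-CH or CC-CEDICT) are all assumptions made for illustration. It keeps only the first column of each line, splits entries into single- and multi-character sets, and intersects the unique entries with the word list.

```python
def analyse_vocab(lines, known_words):
    """Count entries, characters and recognised words in a vocab export.

    Assumptions (hypothetical, not taken from the real script):
    - each line is tab-separated with the Hanzi in the first column
    - known_words is any iterable of words from a reference word list
    """
    # Keep only the first column and drop blank entries.
    entries = [line.split("\t")[0].strip() for line in lines]
    entries = [e for e in entries if e]

    unique = set(entries)
    single = {e for e in unique if len(e) == 1}   # one-character entries
    multi = unique - single                       # 2+ characters: Skritter 'words'

    # Characters are counted over the Hanzi column only (an assumption).
    chars = [c for e in entries for c in e]

    return {
        "unique_single": len(single),
        "unique_multi": len(multi),
        "unique_entries": len(unique),
        "total_entries": len(entries),
        "unique_chars": len(set(chars)),          # Skritter 'characters'
        "total_chars": len(chars),
        # Set intersection: entries the reference list also treats as words.
        "recognised": len(unique & set(known_words)),
    }


sample = ["你好\thello", "好\tgood", "你好\thello", "谢谢\tthanks", "氵\twater radical"]
print(analyse_vocab(sample, {"你好", "好", "谢谢"}))
```

In this toy input the radical 氵 is counted as an entry and as a character but is not in the word list, which is the kind of gap between "entries" and "recognised words" that the two dictionary counts are meant to show.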

szhen   December 17th, 2013 6:19a.m.

thanks Alan - working for me - exactly right, was dropping in skritter export with english too! Love SUBTLEX-CH word count, CC-CEDICT a bit flattering (think it includes characters and character radicals that are not words).

Alan   December 20th, 2013 9:25a.m.

Yeah, that makes sense: the SUBTLEX data is probably a much better source to use for what makes up that elusive concept of a 'word' in Chinese.

I'll leave the CC-CEDICT one on there too; it might be useful to someone for a different reason.
