I have set my post languages to Persian and English and deselected Undetermined.
But I still get posts in other languages, and when I open the settings again I see that the Undetermined option is selected again. It seems that I can't disable it. Any idea why?
For the life of me I can't get it to stay unselected. Try it for yourself: on the web page, open the settings, deselect Undetermined, select a language (for example English), and press Save at the bottom. Come back to the settings page and Undetermined is selected again. English or whatever other languages you selected are saved correctly, but Undetermined just won't stay disabled.
It pollutes my feed with German, French, and other languages that I don't understand.
Honestly, it might be better to change the feature from how it works today, where humans select the language type, to do something like having either the instance or client try to infer the language type and do the filtering there. I can tell you that a huge amount of the content that I want to see doesn’t have people explicitly marking the language. Heck, the comment I responded to isn’t marked as English.
There’s some Linux utility or library that does statistical guessing of language based on characters seen. Probably also more sophisticated stuff out there. Lemme see if I can dig it up.
hunts around a bit
Well, this isn’t it, but here’s a Python module. On Debian trixie:
So it’d be 99.999% confident that your username is Arabic. Something like PieFed or Lemmy or a client could make use of that. Maybe use some heuristics a bit to default to assuming that the language is the same as the language of the parent comment or post or community average language or something, since very short comment texts might be unclear or ambiguous.
That’s not perfect, because sometimes people will quote stuff in other languages or something like that, but I’d wager that it’d be more accurate than manually-tagged stuff.
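To make the fallback idea above a bit more concrete, here's a rough sketch using only the Python standard library. It is not the module mentioned earlier, and it isn't what PieFed or Lemmy actually do. It only guesses the writing script (Latin, Arabic, Cyrillic, ...) from Unicode character names, so it can't tell German from English or Persian from Arabic; a real version would swap in a proper statistical language detector. The function names and the `min_letters` / `min_share` thresholds are made up for illustration.

```python
import unicodedata
from collections import Counter


def dominant_script(text):
    """Return (script, share) for the most common script among the letters in text.

    Returns (None, 0.0) when the text contains no letters at all.
    """
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            # Unicode names start with the script, e.g. "ARABIC LETTER SEEN".
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    script, n = counts.most_common(1)[0]
    return script, n / total


def effective_script(text, parent_script, min_letters=10, min_share=0.8):
    """Guess the script of a comment, falling back to the parent's script.

    Very short or mixed-script texts are treated as ambiguous, so the
    parent comment's (or community's) script is assumed instead.
    Both thresholds are arbitrary illustrative values.
    """
    letters = sum(ch.isalpha() for ch in text)
    script, share = dominant_script(text)
    if script is None or letters < min_letters or share < min_share:
        return parent_script
    return script
```

A client could then hide anything whose `effective_script(...)` result isn't in the set the user picked, and the same fallback shape would work with a real language detector in place of `dominant_script`.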