[users] Language tagging does not always tell the truth
- 📧 Messages: 2
- 🗣️ Authors: 2
- 📅 First Message: 2021-03-30 07:03
- 📅 Last Message: 2021-03-30 07:16
1. Stephane Bortzmeyer (stephane (a) sources.org)
- 📅 Sent: 2021-03-30 07:03
- 📧 Message 1 of 2
I was happy to see our first capsule in Chinese, but the language
tagging ('zh-TW') is misleading: all the texts are in
English. Strange.
2. Petite Abeille (petite.abeille (a) gmail.com)
- Subject Changed! New Subject: Re: [users] Language tagging does not always tell the truth
- 📅 Sent: 2021-03-30 07:16
- 📧 Message 2 of 2
> On Mar 30, 2021, at 09:03, Stephane Bortzmeyer <stephane@sources.org> wrote:
>
> but the language tagging ('zh-TW') is misleading, all the texts are in english
franc: detect the language of text
https://github.com/wooorm/franc/tree/main/packages/franc
# '/usr/local/bin/franc' --ignore glg,vec --min-length 256 < '04.content.utf.txt' 2>/dev/null
±0¢
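
For anyone without franc installed, a much cruder check can already catch the mismatch described in this thread. The sketch below is not how franc works and is not a real language detector; it only asks whether text declared as Chinese contains a meaningful share of CJK ideographs. The function names (han_ratio, tag_looks_plausible) and the 0.5 threshold are made up for illustration.

```python
import unicodedata


def han_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that are CJK ideographs."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    # Unicode names of Han characters start with "CJK UNIFIED IDEOGRAPH" or
    # "CJK COMPATIBILITY IDEOGRAPH", so a substring test is enough here.
    han = sum(1 for ch in letters if "CJK" in unicodedata.name(ch, ""))
    return han / len(letters)


def tag_looks_plausible(lang_tag: str, text: str, threshold: float = 0.5) -> bool:
    """Heuristic: does the declared language tag roughly match the script used?

    Only the primary subtag 'zh' is checked; for every other tag the function
    has no opinion and returns True. The threshold is an arbitrary assumption.
    """
    primary = lang_tag.split("-")[0].lower()
    if primary == "zh":
        return han_ratio(text) >= threshold
    return True


if __name__ == "__main__":
    english = "I was happy to see our first capsule in Chinese."
    print(tag_looks_plausible("zh-TW", english))
```

A script test like this only says that a 'zh-TW' tag on Latin-script text is suspicious. It cannot say what the actual language is, which is what a tool such as franc is for.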