CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages

bigdenis
 
 
Posts: 8
Joined: Sat Mar 02, 2019 11:09 am

Post by bigdenis » Sat Mar 02, 2019 1:36 pm

Dear Pablo!

I'm from Vojvodina in the north of Serbia, where many languages are spoken, so sites here are multilingual. I'm having difficulties with CMS Search. Everything seems to be OK, but CMS Search behaves strangely. I'm illustrating this with your CMS demo project. There is no custom code. I turned on Unicode support for CMS Admin and CMS View. The editor is the newest CKEditor 4. The database collation is utf8_general_ci and the charset is utf8.

http://www.ntesla.edu.rs/cmseredeti/index.html

Strange behavior with Hungarian language: search can find some of the words but not every word
Strange behavior with Serbian Cyrillic language: search can't find any words
Strange behavior with Serbian Latin language: search can find some of the words but not every word

What is your advice?

I saw your CMS demo at http://www.wysiwygwebbuilder.com/suppor ... php?page=4 and there CMS Search works fine with the Cyrillic article (it finds EVERY word from the article). The difference is that the article there is in Belarusian, so maybe the database configuration is different?

I have completed a new web site with CMS tools and noticed the above mentioned things.
http://www.ntesla.edu.rs/sr_intro.html
In this site I used CMS Tools in 3 places:
http://www.ntesla.edu.rs/prosveta/sr_dogadjaji.php
http://www.ntesla.edu.rs/informacije/sr_dok_skole.php
http://www.ntesla.edu.rs/informacije/sr ... abavke.php
(Of course the same pages exist in the other language too, with the Hungarian prefix hu_ in the page names.)

I think the simplest way to find the error is through the copy of your CMS Demo project that I linked above, in which I changed almost nothing. If you still want my project source based on your CMS Demo, tell me and I will upload it somewhere.

Thanks in advance!

Pablo
 
Posts: 16001
Joined: Tue Mar 28, 2006 12:00 pm
Location: Europe

Post by Pablo » Sat Mar 02, 2019 1:48 pm

Are you sure the database is configured as UTF8/unicode?
Do you see the search words in the database?

Post by bigdenis » Sat Mar 02, 2019 5:13 pm

According to the phpMyAdmin screenshot, the database is configured as utf-8:
Image

The search words are in the database in all 3 languages...
Serbian cyrillic:
Image

Serbian Latin:
Image

Hungarian:
Image

The screenshots do not show all of the words, of course; that would need bigger images...
(But "rnrnФранцуз" is not a real word; "Француз" IS the real word.
Likewise "rnrnsrbi" isn't a real word, but "Srbi" IS.)

Searching for words from the screenshots:
Serbian Cyrillic - the word "индоевропских" is in the word list in the database, but CMS Search can't find it in the article.
Serbian Latin - the word "Mađarskoj" is in the word list in the database, but CMS Search can't find it in the article.
Hungarian - the word "anyanyelvűek" is in the word list in the database, but CMS Search can't find it in the article.
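
The "rnrn" prefix on some of the stored words mentioned above looks like the remains of two escaped line breaks whose backslashes were removed before the text was split into words. That is only a guess about the cause. The Python sketch below uses hypothetical function names and is not the CMS script's actual code; it only shows how such a prefix can arise.

```python
import re

def strip_backslashes(text):
    """Hypothetical step: drop every backslash from already-escaped text."""
    return text.replace("\\", "")

def split_words(text):
    """Split text into words; in Python 3, \\w matches Unicode letters for str input."""
    return re.findall(r"\w+", text)

# The literal characters backslash-r-backslash-n, twice, followed by the word.
escaped = "\\r\\n\\r\\nФранцуз"
words_from_escaped = split_words(strip_backslashes(escaped))

# Real carriage-return/line-feed characters are not word characters,
# so they do not end up glued to the word.
words_from_real_breaks = split_words("\r\n\r\nФранцуз")

print(words_from_escaped)
print(words_from_real_breaks)
```

If something like this happens during indexing, the first word of each paragraph is stored with the extra prefix and a search for the plain word will not match that entry.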

Post by Pablo » Sat Mar 02, 2019 5:59 pm

I'm sorry, I don't think I can help you with this.
The CMS script may be incompatible with these languages, although you are the first user to report this issue.

Post by bigdenis » Sun Mar 03, 2019 10:03 am

Sorry to hear that. Searching the articles is an important feature. In the meantime I figured out that CMS Search...
...in Hungarian works correctly for words WITHOUT the national characters éáűúőóüí
...in Serbian Latin works correctly for words WITHOUT the national characters čćžđš
...in Serbian Cyrillic does not work at all
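
A pattern like this (plain ASCII words work, accented words fail, Cyrillic fails completely) is what one would expect if, somewhere between the page and the database, UTF-8 bytes are read with a single-byte charset. Whether that is what happens here is an assumption; the thread does not confirm it. A minimal Python sketch of the effect, with a hypothetical helper name:

```python
def misread_as_latin1(word):
    """Encode as UTF-8, then decode the same bytes as Latin-1.

    This imitates a hypothetical charset mismatch between two components.
    """
    return word.encode("utf-8").decode("latin-1")

samples = ["Srbi", "Mađarskoj", "anyanyelvűek", "индоевропских"]

for word in samples:
    seen_by_other_side = misread_as_latin1(word)
    status = "unchanged" if seen_by_other_side == word else "altered"
    print(word, "->", status)
```

Only the pure ASCII word survives the round trip unchanged. A word with one national character is altered in that one place, and a Cyrillic word is altered in every letter, so a comparison against it would never match.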

I'm not an expert, but I read a little and found this article on the internet, among others with the same content:
https://mathiasbynens.be/notes/mysql-utf8mb4
It's about full Unicode support and using utf8mb4 instead of utf8. I'm not sure, but could you please reconsider your CMS script and implement full Unicode support for the above mentioned languages? It could be an option for developers to choose in the development interface (like the existing checkbox/property for Unicode support in CMS Admin and CMS View).
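
For context: MySQL's utf8 charset stores at most 3 bytes per character, while utf8mb4 allows 4. A quick way to see which characters actually need the fourth byte is to measure their UTF-8 length. The sample characters below are my own picks for illustration:

```python
def utf8_length(ch):
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

samples = {
    "Hungarian ű": "ű",
    "Serbian Latin đ": "đ",
    "Serbian Cyrillic Ф": "Ф",
    "emoji 😀": "😀",
}

for label, ch in samples.items():
    n = utf8_length(ch)
    fits = "fits in 3-byte utf8" if n <= 3 else "needs utf8mb4"
    print(label, n, "byte(s),", fits)
```

The Hungarian and Serbian letters take 2 bytes each, so the 3-byte limit by itself should not stop them from being stored; only characters outside the Basic Multilingual Plane, such as emoji, need utf8mb4.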

Maybe UTF-16 is the solution? I'm asking because I have other difficulties with Hungarian/Serbian Latin/Serbian Cyrillic in tables when I import content from a CSV file.
With a plain table, importing Cyrillic text from a UTF-8 CSV shows incorrect characters, but importing from a UTF-16 CSV shows the content correctly. Sadly, at the first attempt to edit it, it changes to garbage graphic characters.
With your Responsive Data Table extension the UTF-8 CSV Cyrillic import is incorrect too, but the UTF-16 CSV Cyrillic import is OK and the characters are shown correctly. I will report this anomaly in the right place and category on this forum. I mention it here because it is the same kind of character encoding problem.

Dear Pablo, please help me to solve this problem (or these problems). If not now, then in future releases. I'm going to make other multilingual projects with CMS tools.

Thanks in advance!

Post by Pablo » Sun Mar 03, 2019 10:40 am

I think the CMS script is Unicode compliant. I have tested it with different Unicode languages.
The documentation you are referring to is all about database configuration.
Unfortunately, I cannot help you with the configuration of the server.

Post by bigdenis » Sun Mar 03, 2019 12:01 pm

Dear Pablo!

I accept your verdict, but let me present a few more points.
Once more, I'm not an expert and I'm not pretending to be. I'm only trying to use my own brain...
Before I wrote the previous post to you, I of course tried some things with the database configuration.

First attempt:
Instead of the plain utf8 charset and collation, I created the database with the utf8mb4 charset and collation. At the same time I tried to change your generated script to support utf8mb4 via search/replace. It did not succeed; I probably messed something up. You are the master of your scripts.

Second attempt:
Instead of the plain utf8 charset and collation, I created the database with the utf16 charset and collation. At the same time I tried to change your generated script to support utf16 via search/replace. As a result I saw Chinese characters, so this did not succeed either. Once again, you are the master of your own scripts.
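
Seeing Chinese characters is a typical sign that UTF-8 bytes are being interpreted as UTF-16: each pair of bytes is read as one 16-bit code unit, and many such values fall in the CJK range. Whether that is exactly what happened in this attempt is an assumption. A minimal Python sketch with a hypothetical helper name:

```python
def misread_as_utf16(text):
    """Encode as UTF-8, then decode the same bytes as little-endian UTF-16.

    Works here because the sample has an even number of bytes.
    """
    return text.encode("utf-8").decode("utf-16-le")

garbled = misread_as_utf16("Srbi")

# "Srbi" is the bytes 53 72 62 69; read in pairs (little-endian) they
# become the code points U+7253 and U+6962.
for ch in garbled:
    in_cjk = 0x4E00 <= ord(ch) <= 0x9FFF  # CJK Unified Ideographs block
    print(hex(ord(ch)), "CJK ideograph" if in_cjk else "other")
```

Four Latin letters turn into two ideographs, which matches the kind of output described above.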

I'm very sad now... Anyway, thanks for your patience in reading my posts...

Post by Pablo » Sun Mar 03, 2019 12:40 pm

To see if the script supports the characters, I have added the words "Mađarskoj" and "anyanyelvűek" to the test page:
http://www.wysiwygwebbuilder.com/suppor ... php?page=4

As you can see, this seems to work correctly, which indicates that the script works for these languages.

Post by bigdenis » Sun Mar 03, 2019 1:01 pm

Yes, thank you, you are right!! This is crystal clear now. It is something in the database configuration, but what... I'm going crazy... I have spent days trying to figure it out.
Thanks again for giving me a fixed point for further investigation! I'm fond of WYSIWYG Web Builder and I'm planning to be a long-time user of it.
