Jalaj P. Jha Technical & Miscellaneous Ramblings

12Apr/0722




Storing Unicode Data in Databases

This post is in continuation to my earlier post VB Forms in Hindi Unicode Fonts following a few comments regarding improper display of stored data.

Let's first use an MS Access Database (referred in code as D:uni.mdb), containing a single table "unitable" which itself contains only one text field "unistring".

The code for storing data to tables and retrieving them are referred from Database - How can I add a record to a database? and Database - How can I fill a combobox/Listbox with values in a database?, which may be referred for more info on relevant task.

hindidb.jpg

As visible above, the form contains a text box ('txtNewString') for user Input, a button ('cmdAdd') labelled 'Add', on clicking which the text from the textbox is inserted into the table.

Another button ('cmdRetrieve') labelled 'Retrieve', on press, retrieves all the data from the table to populate the listbox ('lstRetrievedData').

As also discussed in VB Forms in Hindi Unicode Fonts, these controls should be capable of accepting and displaying Unicode Data, which the default VB controls are not capable of. However, corresponding controls available with "Microsoft Forms 2.0 Object Library" allow unicode data for which just right-click the toolbox, click on 'components...' and check "Microsoft Forms 2.0 Object Library". Now use these controls instead.

The code below demonstrate adding and retrieving data from table on press of relevant buttons. Go through the code... nothing new!! all that you would have written for English based forms...

Private Sub cmdAdd_Click()

    Dim oADOCon As ADODB.Connection

    Dim strQuery As String

    Set oADOCon = New ADODB.Connection
    oADOCon.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=d:\uni.mdb;User ID=;Password=;"

    strQuery = "insert into unitable values('" & Form1.txtNewString & "')"
    oADOCon.Execute strQuery

    txtNewString = ""

    Set oADOCon = Nothing

End Sub

Private Sub cmdRetrieve_Click()

    Dim oADOCon As ADODB.Connection
    Dim oADORS As ADODB.Recordset

    Dim strQuery As String

    Set oADOCon = New ADODB.Connection
    Set oADORS = New ADODB.Recordset

    oADOCon.CursorLocation = adUseClient

    oADOCon.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=d:\uni.mdb;User ID=;Password=;"

    strQuery = "SELECT unistring FROM unitable"
    oADORS.Open strQuery, oADOCon, adOpenForwardOnly, adLockReadOnly, adCmdText

    lstRetrievedData.Clear

    Do While Not oADORS.EOF

        lstRetrievedData.AddItem oADORS.Fields("unistring")
        oADORS.MoveNext

    Loop

    Set oADOCon = Nothing
    Set oADORS = Nothing

End Sub

Work's fine! Now let's try the above code with an MS-SQL database... Create a similar table in an existing database or to a new database (I used existing Northwind database) with a single text field.

The code should normally work by making changes to line(s) below.

    oADOCon.Open "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=d:\uni.mdb;User ID=;Password=;"
    oADOCon.Open "DSN=YourDSN;database=Northwind;uid=user;pwd=pwd"

But it doesn't and the text is saved as ?????? and retrieved as such, since the data types char, varchar, text are designed to accept only 8-bit characters and not the unicode ones. Data types nchar, nvarchar and ntext corresponding to the previously mentioned data types do however allow unicode characters, so we will have to ensure that the data type of the single field in our table is changed accordingly. 'n' in these data types is elaborated by MySQL as 'National'.

That's not all!, Now that we have made arrangents for correct storage at the database level, we also need to ensure that the data sent by us is recognized by the database as Unicode (or National, as described by MySQL). For this we need to change the insert script by adding a single character 'N' just before the unicode string as :

    strQuery = "insert into unitable values('" & Form1.txtNewString & "')"
    strQuery = "insert into unitable values(N'" & Form1.txtNewString & "')"

That's all. By now you can create database Apps for storing hindi text using MS Access, MS-SQL (both tested) and MySQL (not tested, but documentation seems to support my belief). You can also try other SQL servers also. I would later sometime test this with Oracle and if any issue found, would post it.

Comments (22) Trackbacks (3)
  1. Excellent Sample!! Great Job!!!Thanks very much!!! I shall refere this article to many.

    Thanks very Much once again!!!

  2. Hello

    Well Im developing software for Middle East(Arabic). Im using VB 6.0, the requirement is some fields (textbox) are of english and some are ment for arabic…on the same form, I want to change the locale(input) automatically when it comes to Arabic Field and vice verse.

    How can I do it??

    If you feel this question is out of topic to this article kindly move it!!(sorry!!) but please get me a word by my email.

    regards
    phil

  3. Hi Phil,

    There is nothing such as out of topic… your requirement is genuine… we do/may get such requirements… I will revert asap

  4. Excellent article. But whenever I tried to save the unicode data in oracle DB, it saves as ??? while for access DB and SQL server, it works fine. Could you pl give any solution why it is saving ??? instead of actual unicode data

  5. Very Informative Information.
    As Shaliendra raises the question that when we are trying to do the same in oracle database, it is storing as ???. The Field datatype in the database in NVARCHAR. Could you pl give any solution why it is like this.

  6. @Amit & Shailendra,

    You may refer http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96529/ch6.htm to check what you are missing out.

    The link above confirms “N” added before the field data works as it did with MS-SQL and MySQL. There is also a TO_NCHAR Function that you may try using.

    Example 6-1 Populating the Customer Table Using the TO_NCHAR Function
    The TO_NCHAR function converts the data at run time, while the N function converts the data at compilation time.

    INSERT INTO customers VALUES (1000,
    TO_NCHAR(‘John Smith’),N’500 Oracle Parkway’,sysdate);

    The above link is a good source of relevant information. Hope it could help you.

  7. how to insert unicode data using c#. pls provide c# code

  8. PHP and Mysql
    insert into table1 values(1,N’bc’);
    successfully insert into the table1 but the problem is value insert using N cannot be retrieve correctly using PHP and Mysql

  9. Wonderful..Thanks for the solution

  10. This is good. But how can we input hindi characters in the textbox. Is there any way ?

  11. Thanks a lot!
    Very clear and useful.

  12. When I try to insert unicode text in MySQL database using c# still ????? inserted into database. I try by using prefix N but still not working.

  13. Hi
    My problem is that i am unable to use the method mentioned in link given in above post to add combobox items from data base. I am trying to use Microsoft Forms 2.0 Object Library ComboBox and when i try to add item using the method mentioned in above give link it gives error that there is no ItemData method with which i can add values for ComboBox items. Can you provide any help in this regard??

  14. Very Helpful to me!!! Thank you very much.

  15. finally i got wht i want and it was so simple, thanks for the article

  16. Thanks a lot!
    I am writing an VB 2008 application and using Cyrillic characters. That capital N in front of quoted string saved the day!

  17. The code provided works fine with microsoft access but when it comes to mysql or sql server. its just putting ??? marks in Database.

  18. The code provided talks about updating the unicode chars… how about retrieving values from table in access and displaying in vb text box without “????”

  19. hey,

    I have one minor problem… that i have to insert unicde data into access database using C#.Net, but i am not capable to insert marathi(unicode) data..into access will u help me plz..

  20. Hi everybody i have little diffrent problem………

    I have UNICODE in my database manually entered how to retrieve the Urdu UNICODE…….. from MS Access it retrieves as [????]

    or i have to configure UNICODE column to 8-bit or something ………..Data type is text in which UNICODE is stored

    Thanks

  21. In VB it looks easy , well i am trying it with PHP, and completed with MySql , and now on MS Access , its bit slow, if you know tell me about storing Unicode data into MS Access using PHP(ODBC),
    Thank you,
    Stylus .


Leave a comment