An AI-powered Baybayin translator? UP mathematicians are developing one
Filipino mathematicians from the University of the Philippines (UP) recently developed a computerized approach that can convert text written in the Baybayin writing system into readable Latin characters.
According to the UP Diliman-College of Science Institute of Mathematics (UPD-CS IM), the scientists, through mathematics and technology, made "what is likely the world's first paragraph-level optical character recognition (OCR) system that can distinguish between entire blocks of Baybayin and Latin characters in a text image."
Called "Block-level Optical Character Recognition System for Automatic Transliterations of Baybayin Texts Using Support Vector Machine," the system, which runs on a support vector machine (SVM) character classifier, was developed by master's student Rodney Pino and associate professors Dr. Renier Mendoza and Dr. Rachelle Sambayan.
"SVM is a machine learning algorithm used to solve regression or classification problems," said Pino.
"We have a dataset for Baybayin characters — let's say character A and then character BA. SVM uses techniques or mathematical methods that can separate the two datasets to determine characters BA and A," he added.
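The separation Pino describes can be illustrated with a minimal sketch. This is not the researchers' code: it is a toy linear SVM trained by subgradient descent on the hinge loss, and the two-number "features" for characters A and BA are made up for illustration. A real OCR system would extract features from character images, and the function names `train_linear_svm` and `predict` are hypothetical.

```python
# Toy linear SVM trained by subgradient descent on the hinge loss.
# All data below is invented for illustration; it does not come from
# the UP dataset described in the article.

def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=200):
    """Return (weights, bias) of a separating line; labels are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is misclassified or inside the margin:
                # shift the boundary so it classifies x more confidently.
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is safely classified: apply only the
                # regularization shrinkage, which favors a wide margin.
                w = [wi - lr * lam * wi for wi in w]
    return w, b


def predict(w, b, x):
    """Label a feature vector by which side of the boundary it falls on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "A" if score >= 0 else "BA"


# Hypothetical features: "A" (+1) clusters near (2, 2), "BA" (-1) near (-2, -2).
samples = [(2.0, 2.5), (1.5, 2.0), (2.5, 1.5),
           (-2.0, -1.5), (-1.5, -2.5), (-2.5, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(samples, labels)
print(predict(w, b, (1.8, 2.2)))    # new point near the "A" cluster
print(predict(w, b, (-2.2, -1.7)))  # new point near the "BA" cluster
```

In this sketch the two new points are labeled A and BA respectively, because each falls on the same side of the learned boundary as its cluster. A classifier for the full script would need one such decision for every pair or group of characters, which is where a large image collection per character becomes important.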
The group took more than three months to collect over a thousand images for each Baybayin character. To improve the SVM's recognition rate, they gathered a total of 110 paragraphs from different websites containing handwritten or typewritten Baybayin, Latin, or mixed Baybayin and Latin writing.
Still, the scientists are trying to make the OCR system more aware of the context of Baybayin words and phrases so that it can serve as a full-fledged translator. They also want the system to work both ways, with the ability to convert Latin words with foreign sounds into Baybayin.
"We're trying to refine the software we developed to make it easier for future users to navigate it. We also dream of creating a mobile application that automatically and accurately translates Baybayin characters just by hovering over the phone," Dr. Mendoza said.
But the system still can't distinguish between some Baybayin characters that look alike when written, such as E and I, and O and U.
As interest in and research on Baybayin slowly increase, the scientists hope to see Filipinos become interested in protecting Baybayin through research.
According to the scientists, Baybayin is living proof that Filipinos have their own technically sophisticated traditions, making it important to keep a record of each Baybayin character, including digitized versions.
"We're hoping that through this OCR system, we could preserve and pass on the knowledge of understanding Baybayin to future Filipino generations," Dr. Sambayan said. — Sherylin Untalan/LA, GMA Integrated News