Machine translation software has become heavily integrated into our daily lives thanks to recent improvements in the performance of deep neural networks. However, machine translation software has been shown to regularly return erroneous translations, which can lead to harmful consequences such as economic loss and political conflicts. Additionally, due to the complexity of the underlying neural models, testing machine translation systems presents new challenges. To address this problem, we introduce a novel methodology called PatInv. The main intuition behind PatInv is that sentences with different meanings should not have the same translation. Under this general idea, we provide two realizations of PatInv that, given an arbitrary sentence, generate syntactically similar but semantically different sentences by (1) replacing one word in the sentence using a masked language model or (2) removing one word or phrase from the sentence based on its constituency structure. We then test whether the returned translations are the same for the original and modified sentences. We have applied PatInv to test Google Translate and Bing Microsoft Translator using 200 English sentences. Two language settings are considered: English-Hindi (En-Hi) and English-Chinese (En-Zh). The results show that PatInv can accurately find 308 erroneous translations in Google Translate and 223 erroneous translations in Bing Microsoft Translator, most of which cannot be found by state-of-the-art approaches.
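The core check in the abstract (sentences with different meanings should not share a translation) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `find_suspicious_pairs`, the `translate` callable, and the toy lookup table are all hypothetical, and the modified sentences are supplied by hand rather than generated by a masked language model or a constituency parser. The translations are placeholder strings, not real system output.

```python
from typing import Callable, Iterable, List, Tuple


def find_suspicious_pairs(
    original: str,
    variants: Iterable[str],
    translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (variant, translation) pairs whose translation is identical
    to the translation of the original sentence.

    Each variant is assumed to differ in meaning from the original, so an
    identical translation suggests at least one of the two is wrong.
    """
    reference = translate(original)
    suspicious = []
    for variant in variants:
        if variant == original:
            continue  # an unchanged sentence tells us nothing
        translation = translate(variant)
        if translation == reference:
            suspicious.append((variant, translation))
    return suspicious


# Hypothetical stand-in for a real translation service. The values are
# placeholders; the last entry deliberately collides with the first to
# simulate a translation error.
TOY_TABLE = {
    "I like cats": "<translation A>",
    "I like dogs": "<translation B>",
    "I hate cats": "<translation A>",
}


def toy_translate(sentence: str) -> str:
    return TOY_TABLE[sentence]


reports = find_suspicious_pairs(
    "I like cats",
    ["I like dogs", "I hate cats"],
    toy_translate,
)
print(reports)
```

In this toy run only "I hate cats" is reported, because its placeholder translation matches that of the original sentence. A real setup would replace `toy_translate` with calls to the system under test and would need a sentence generator that reliably changes the meaning.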
Thu 12 Nov. Times are displayed in time zone: (UTC) Coordinated Universal Time
08:00 - 08:30: ML Testing 2 | Paper Presentations / Journal First / Research Papers / Tool Demos / Visions and Reflections, at Virtual room 2
08:00 - 08:02 Talk | DeepSearch: A Simple and Effective Blackbox Attack for Deep Neural Networks | Research Papers
08:03 - 08:04 Talk | Machine Learning Based Test Data Generation for Safety-critical Software | Paper Presentations | Ján Čegiň (Faculty of Informatics and Information Technologies, Slovak Technical University)
08:05 - 08:06 Talk | Machine Learning Testing: Survey, Landscapes and Horizons | Journal First | Jie M. Zhang (University College London, UK), Mark Harman (University College London, UK), Lei Ma (Kyushu University), Yang Liu (Nanyang Technological University, Singapore)
08:07 - 08:08 Talk | Machine Translation Testing via Pathological Invariance | Research Papers | Shashij Gupta (IIT Bombay, India), Pinjia He (ETH Zurich, Switzerland), Clara Meister (ETH Zurich, Switzerland), Zhendong Su (ETH Zurich)
08:09 - 08:10 Talk | Model-Based Exploration of the Frontier of Behaviours for Deep Learning System Testing | Research Papers
08:11 - 08:12 Talk | PRODeep: A Platform for Robustness Verification of Deep Neural Networks | Tool Demos | Renjue Li (Institute of Software at Chinese Academy of Sciences, China), Jianlin Li (Institute of Software at Chinese Academy of Sciences, China), Cheng-Chao Huang (Institute of Intelligent Software, China), Pengfei Yang (Institute of Software at Chinese Academy of Sciences, China), Xiaowei Huang (University of Liverpool), Lijun Zhang (Institute of Software, Chinese Academy of Sciences), Bai Xue (Institute of Software at Chinese Academy of Sciences, China), Holger Hermanns (Saarland University)
08:13 - 08:14 Talk | Testing Machine Learning Code using Polyhedral Region | Visions and Reflections | Md Sohel Ahmed (National Institute of Informatics, Japan), Fuyuki Ishikawa (National Institute of Informatics), Mahito Sugiyama (National Institute of Informatics, Japan)
08:15 - 08:30 Talk | Conversations on ML Testing 2 | Paper Presentations | Fuyuan Zhang (MPI-SWS, Germany), Ján Čegiň (Faculty of Informatics and Information Technologies, Slovak Technical University), Mark Harman (University College London, UK), Renjue Li (Institute of Software at Chinese Academy of Sciences, China), Shashij Gupta (IIT Bombay, India), Vincenzo Riccio (USI Lugano, Switzerland), M: Shin Yoo (Korea Advanced Institute of Science and Technology)