A Network of Excellence forging the
Multilingual Europe Technology Alliance

Publications

 

T4ME METANET: Publications under WP1: Bringing more Semantics into Machine Translation

2010

  • C. Hardmeier and M. Federico (2010), "Modelling Pronominal Anaphora in Statistical Machine Translation." In Proceedings of the Seventh International Workshop on Spoken Language Translation (IWSLT), Paris. (pdf, bib)
  • A. Tamchyna, O. Bojar (2010), “Bohatá anotace ve frázovém strojovém překladu.” In Informačné technológie – Aplikácie a Teória, Zborník príspevkov prezentovaných na konferencii ITAT, Seňa, Slovakia.
  • O. Bojar, N. Klyueva, J. Hajič (2010), “Czech-English-Russian Corpus for AppTek.” (url)
  • O. Smrž, J. Hajič (2010), “The Other Arabic Treebank: Prague Dependencies and Functions.” In Arabic Computational Linguistics, CSLI Publications, Stanford, CA, USA.  (pdf)
  • E. Hajičová, A. Abeillé, J. Hajič, J. Mírovský, Z. Urešová (2010), “Treebank Annotation.” In Handbook of Natural Language Processing, Second Edition, CRC Press, Taylor and Francis Group, Boca Raton, FL, USA, ISBN 978-1-4200-8592-1.
  • E. Bejček, P. Hoffmannová, M. Holub, M. Hučínová, P. Pecina, P. Straňák, P. Šidák, J. Hajič (2010), “Lexikálně-sémantická anotace PDT pomocí Českého WordNetu.” (url)
  • M. Popel, Z. Žabokrtský (2010), “TectoMT: Modular NLP Framework.” In Lecture Notes in Computer Science, 6233, In Proceedings of the 7th International Conference on Advances in Natural Language Processing (IceTAL 2010), Berlin / Heidelberg.
  • K. Pala, T. Čapek, B. Zajíčková, D. Bartůšková, K. Kulková, D. Hlaváčková, P. Hoffmannová, E. Bejček, P. Straňák, J. Hajič (2010), “Český WordNet 1.9 PDT.” (url)
  • E. Bejček, N. Klyueva, P. Straňák, P. Šidák, E. Šťastná, P. Vimmrová, J. Hajič (2010), “Multiword Expressions in PDT 2.0.” (url)
  • O. Bojar, Z. Žabokrtský, J. Hajič (2010), “CzEng 0.9 for AppTek.” (url)
  • D. Mareček, M. Popel, Z. Žabokrtský (2010), “Maximum Entropy Translation Model in Dependency-Based MT Framework.” In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, Uppsala, Sweden. (pdf)
  • M. Popel (2010), “English-Czech Machine Translation Using TectoMT.” In WDS 2010 Proceedings of Contributed Papers, Praha, Czechia. (pdf)
  • M. Popel, D. Mareček (2010), “Perplexity of n-gram and Dependency Language Models.” In Lecture Notes in Computer Science, 6231, In Text, Speech and Dialogue. 13th International Conference, TSD 2010, Brno, Czech Republic, September 6-10, 2010. Proceedings, Berlin / Heidelberg.
  • J. Hajič, P. Ircing, J. Romportl, J. Ptáček, S. Cinková (2010), “Senior Companion CZ.” (url)

2011

  • M. Federico, L. Bentivogli, M. Paul, S. Stüker (2011), "Overview of the IWSLT 2011 evaluation campaign." In Proceedings of the International Workshop on Spoken Language Translation, San Francisco. (pdf, bib)
  • N. Ruiz, A. Bisazza, F. Brugnara, D. Falavigna, D. Giuliani, S. Jaber, R. Gretter, M. Federico (2011), "FBK@IWSLT 2011." In Proceedings of the International Workshop on Spoken Language Translation, San Francisco. (pdf, bib)
  • C. Hardmeier, J. Tiedemann, M. Saers, M. Federico, P. Mathur (2011), "The Uppsala-FBK systems at WMT 2011." In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh. (pdf, bib)
  • N. Ruiz, M. Federico (2011), "Topic Adaptation for Lecture Translation through Bilingual Latent Semantic Models." In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh. (pdf, bib)
  • O. Bojar, A. Tamchyna (2011), “Improving Translation Model by Monolingual Data.” In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, UK. (url)
  • O. Hálek, R. Rosa, A. Tamchyna, O. Bojar (2011), “Named Entities from Wikipedia for Machine Translation.” In Information Technologies – Applications and Theory, Košice, Slovakia.
  • J. Hajič, P. Pajas, P. Ircing, J. Romportl, N. Peterek, M. Spousta, M. Mikulová, M. Grůber, M. Legát (2011), “Pražská databáze mluvené češtiny.” (url)
  • Z. Urešová (2011), “Valence sloves v Pražském závislostním korpusu.” Ústav formální a aplikované lingvistiky, Praha, Czechia.
  • O. Bojar, A. Tamchyna (2011), “Forms Wanted: Training SMT on Monolingual Data.” (url)
  • D. Zeman (2011), “Hierarchical Phrase-Based MT at the Charles University for the WMT 2011 Shared Task.” In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, UK.  (url)
  • V. Kettnerová, M. Lopatková (2011), “The Lexicographic Representation of Czech Diatheses: Rule Based Approach.” In Natural Language Processing, Multilinguality, Bratislava, Slovakia.
  • D. Zeman, M. Fishel, M. Popović, J. Berka, O. Bojar, S. Jaber, A. Bisazza, S. Hunsicker, M. Popel (2011), “Addicter 2.0.” (url)
  • E. Bejček, J. Panevová, J. Popelka, L. Smejkalová, P. Straňák, M. Ševčíková, J. Štěpánek, J. Toman, Z. Žabokrtský, J. Hajič (2011), “Prague Dependency Treebank 2.5.” (url)
  • D. Mareček, M. Popel, L. Ramasamy, J. Štěpánek, D. Zeman, Z. Žabokrtský, J. Hajič (2011), “HamleDT - HArmonized Multi-LanguagE Dependency Treebank.” (url)
  • M. Popel, D. Mareček, N. D. Green, Z. Žabokrtský (2011), “Influence of Parser Choice on Dependency-Based MT.” In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, UK.
  • Z. Urešová (2011), “Valenční slovník Pražského závislostního korpusu (PDT-Vallex).” Ústav formální a aplikované lingvistiky, Praha, Czechia.
  • Z. Žabokrtský, M. Popel, D. Mareček, T. Kraut (2011), “Treex::Core.” (url)
  • O. Bojar, M. Ercegovčević, M. Popel, O. F. Zaidan (2011), “A Grain of Salt for the WMT Manual Evaluation.” In Proceedings of the Sixth Workshop on Statistical Machine Translation, Edinburgh, UK. (pdf)
  • M. Fishel, O. Bojar, D. Zeman, J. Berka (2011), “Automatic Translation Error Analysis.” In Lecture Notes in Computer Science, 6836, (url)
  • B. Jawaid, D. Zeman (2011), “Word-Order Issues in English-to-Urdu Statistical Machine Translation.” In The Prague Bulletin of Mathematical Linguistics, 95, (url)
  • J. Hajič, P. Pajas, P. Ircing, J. Romportl, N. Peterek, M. Spousta, S. Cinková, M. Mikulová, J. Psutka (2011), “Prague Database of Spoken English.” (url)
  • P. Pořízka, M. Schäfer, D. Zeman (2011), “MorphCon.” (url)
  • J. Hajič, J. Mlynář (2011), “Archiv vizuální historie přístupný v Centru Malach.” In Archivní časopis, vol. 61, no. 4.
  • B. Hladká, A. Bémová, Z. Urešová (2011), “Syntaktická proměna Českého akademického korpusu.” In Slovo a slovesnost, 4.
  • E. Bejček, P. Straňák, D. Zeman (2011), “Influence of Treebank Design on Representation of Multiword Expressions.” In Lecture Notes in Computer Science, 6608, (url)
  • S. D. Larasati, V. Kuboň, D. Zeman (2011), “Indonesian Morphology Tool (MorphInd): Towards an Indonesian Corpus.” In Communications in Computer and Information Science, 100, (url)
  • D. Zeman, M. Fishel, J. Berka, O. Bojar (2011), “Addicter: What Is Wrong with My Translations?” In The Prague Bulletin of Mathematical Linguistics, 96, (pdf)
  • O. Bojar, Z. Žabokrtský, O. Dušek, P. Galuščáková, M. Majliš, D. Mareček, J. Maršík, M. Novák, M. Popel, A. Tamchyna (2011), “CzEng 1.0.” (url)
  • V. Kuboň, M. Lopatková (2011), “Studying Properties of Czech Complex Sentences from an Annotated Corpus.” In Proceedings of the 24th International Florida Artificial Intelligence Research Society Conference (FLAIRS 2011), Menlo Park, CA, USA.
  • J. Hajič, E. Hajičová, J. Panevová, P. Sgall, S. Cinková, E. Fučíková, M. Mikulová, P. Pajas, J. Popelka, J. Semecký, J. Šindlerová, J. Štěpánek, J. Toman, Z. Urešová, Z. Žabokrtský (2011), “Prague Czech-English Dependency Treebank 2.0.” (url)

2012

  • A. Bisazza and M. Federico (2012), “Cutting the Long Tail: Hybrid Language Models for Translation Style Adaptation.” In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Avignon, France. (pdf, bib)
  • J. Berka, O. Bojar, M. Fishel, M. Popović, D. Zeman (2012), “Automatic MT Error Analysis: Hjerson Helping Addicter.” In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), İstanbul, Turkey.
  • M. Lopatková, P. Homola, N. Klyueva (2012), “Annotation of sentence structure: Capturing the relationship between clauses in Czech sentences.” In Language Resources and Evaluation, vol. 46, no. 1, (url)
  • V. Kuboň, M. Lopatková, M. Plátek (2012), “On Formalization of Word Order Properties.” In Lecture Notes in Computer Science, 7181, In Computational Linguistics and Intelligent Text Processing, 13th International Conference, CICLing 2012, Berlin / Heidelberg.
  • V. Kettnerová, M. Lopatková, E. Bejček (2012), “Mapping Semantic Information from FrameNet onto VALLEX.” In The Prague Bulletin of Mathematical Linguistics, 97, (url)
  • V. Kettnerová, M. Lopatková, E. Bejček (2012), “The Syntax-Semantics Interface of Czech Verbs in the Valency Lexicon.” In Proceedings of the XV Euralex International Congress, Oslo, Norway.
  • V. Kuboň, M. Lopatková, M. Plátek (2012), “Studying Formal Properties of a Free Word Order Language.” In Proceedings of the Twenty-Fifth International Florida Artificial Intelligence Research Society Conference, Palo Alto, California.
  • Z. Urešová (2012), “Building the PDT-VALLEX valency lexicon.” In Proceedings of the fifth Corpus Linguistics Conference, Liverpool, UK.
  • B. Vidová Hladká, Z. Urešová (2012), “Syntactic annotation of transcriptions in the Czech Academic Corpus: Then and now.” In Proceedings of the fifth Corpus Linguistics Conference, Liverpool, UK.
  • D. Zeman (2012), “Data Issues of the Multilingual Translation Matrix.” In Proceedings of NAACL 2012 Workshop on Machine Translation, Montréal, Canada.
  • D. Zeman, D. Mareček, M. Popel, L. Ramasamy, J. Štěpánek, Z. Žabokrtský, J. Hajič (2012), “HamleDT: To Parse or Not to Parse?” In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), İstanbul, Turkey.

 

T4ME METANET: Publications under WP2: Optimising the Division of Labour in Hybrid Translation


2011

  • E. Avramidis (2011), "DFKI System Combination with Sentence Ranking at ML4HMT-2011." In Proceedings of the International Workshop on Using Linguistic Information for Hybrid Machine Translation (LIHMT 2011) and of the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT), Barcelona, Spain, November. META-NET. (pdf, bib)
  • M. R. Costa-jussà and R. Banchs (2011), "A Rule-Based versus a Statistical-Based Machine Translation System in a Cross-Language Sentence Matching Application." (pdf, bib)
  • M. Farrús, M. R. Costa-jussà, J. B. Mariño, M. Poch, A. Hernández, C. Henríquez, and J. A. R. Fonollosa (2011), "Overcoming statistical machine translation limitations. Error analysis and proposed solutions for the Catalan-Spanish language pair." Language Resources and Evaluation, 45(2):181–208. (doi, bib)
  • C. Federmann, Y. Chen, S. Hunsicker, and R. Wang (2011), "DFKI System Combination using Syntactic Information at ML4HMT-2011." In Proceedings of the International Workshop on Using Linguistic Information for Hybrid Machine Translation (LIHMT 2011) and of the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT), Barcelona, Spain, November. META-NET. (pdf, bib)
  • C. Federmann (2011), "Results from the ML4HMT shared task on applying machine learning techniques to optimise the division of labour in hybrid machine translation." In Proceedings of the International Workshop on Using Linguistic Information for Hybrid Machine Translation (LIHMT 2011) and of the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT), Barcelona, Spain, November. META-NET. (pdf, bib)
  • T. Okita and J. van Genabith (2011), "DCU Confusion Network-based System Combination for ML4HMT." In Proceedings of the International Workshop on Using Linguistic Information for Hybrid Machine Translation (LIHMT 2011) and of the Shared Task on Applying Machine Learning Techniques to Optimise the Division of Labour in Hybrid Machine Translation (ML4HMT), Barcelona, Spain, November. META-NET. (pdf, bib)

2012

  • E. Avramidis, M. R. Costa-jussà, C. Federmann, M. Melero, P. Pecina, and J. van Genabith (2012a), "A Richly Annotated, Multilingual Parallel Corpus for Hybrid Machine Translation." In Proceedings of the 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey. European Language Resources Association. (pdf, bib)
  • E. Avramidis, M. R. Costa-jussà, C. Federmann, M. Melero, P. Pecina, and J. van Genabith (2012b), "The ML4HMT Workshop on Optimising the Division of Labour in Hybrid Machine Translation." In Proceedings of the 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey. European Language Resources Association. (pdf, bib)
  • M. R. Costa-jussà and R. Banchs (2012), "Automatic normalization of short texts by combining statistical and rule-based techniques." Language Resources and Evaluation, Special Issue on Analysis of Short Texts on the Web. (bib)
  • M. R. Costa-jussà, M. Farrús, J. B. Mariño, and J. A. R. Fonollosa (2012), "Study and comparison of rule-based and statistical Catalan-Spanish machine translation systems." In Journal of the American Society for Information Science and Technology (JASIST), 31:1001–1026. (bib)
  • M. Farrús, M. R. Costa-jussà, and M. Popovic (2012), "Study and correlation analysis of linguistic, perceptual and automatic machine translation evaluations." Journal of the American Society for Information Science and Technology (JASIST), 63(1): 174–184. (pdf, bib)
  • C. Federmann, M. Melero, P. Pecina, and J. van Genabith (2012), "Towards Optimal Choice Selection for Improved Hybrid Machine Translation." The Prague Bulletin of Mathematical Linguistics, p. 5–22. (pdf, bib)
  • C. Federmann (2012), "Can machine learning algorithms improve phrase selection in hybrid machine translation?" In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, p. 113–118. Association for Computational Linguistics (ACL), European Chapter of the Association for Computational Linguistics (EACL), April. (pdf, bib)
  • T. Okita and J. van Genabith (2012), "Minimum Bayes risk decoding with enlarged hypothesis space in system combination." In 13th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2012): LNCS 7182 Part II, A. Gelbukh (Ed.), p. 40–51, New Delhi, India, Mar. Springer Berlin/Heidelberg. (pdf, bib)


Other Activities

 

Hytra workshop at EACL 2012

  • The Hytra workshop is held at the Conference of the European Chapter of the Association for Computational Linguistics (EACL) and is organized by M. R. Costa-jussà et al. The workshop webpage is available at http://www-lium.univ-lemans.fr/esirmt-hytra/.

ML4HMT workshop at LIHMT 2011

ML4HMT workshop at COLING 2012

T4ME METANET: Publications under WP3: Exploiting the Context of Translation

2010

  • A. Tripathi, A. Klami, and S. Virpioja (2010), "Bilingual sentence matching using kernel CCA." In Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2010), p. 130–135, Kittilä, Finland, August 2010. IEEE. (doi, bib)
  • S. Virpioja, A. Mansikkaniemi, J. Väyrynen, and M. Kurimo (2010), "Applying morphological decompositions to statistical machine-translation." In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, p. 201–206. Association for Computational Linguistics, July 2010. (pdf, bib)
  • M. Dobrinkat, T. Tapiovaara, J. Väyrynen, and K. Kettunen (2010), "Evaluating machine translations using mNCD." In Proceedings of the ACL 2010 Conference Short Papers, p. 80–85. Association for Computational Linguistics. (pdf, bib)
  • M. Dobrinkat, T. Tapiovaara, J. Väyrynen, and K. Kettunen (2010), "Normalized compression distance based measures for MetricsMATR 2010." In Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR, p. 343–348. Association for Computational Linguistics. (pdf, bib)
  • M. Dobrinkat and J. J. Väyrynen (2010), "Experiments with domain adaptation methods for statistical MT. From European Parliament proceedings to Finnish newspaper text." In Proceedings of the 14th Finnish Artificial Intelligence Conference STeP 2010, number 25 in Publications of the Finnish Artificial Intelligence Society, T. Pahikkala, J. Väyrynen, J. Kortela, and A. Airola (Eds), p. 31–38. Finnish Artificial Intelligence Society. (pdf, bib)
  • T. Vatanen, J. J. Väyrynen, and S. Virpioja (2010), "Language identification of short text segments with n-gram models." In Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC'10), p. 3423–3430. European Language Resources Association (ELRA), 2010. (pdf, bib)

2011

  • T. Honkela, J. Laaksonen, H. Törrö, and J. Tenhunen (2011), "Media map. A multilingual document map with a design interface." In Advances in Self-Organizing Maps - Proceedings of WSOM 2011, 8th International Workshop, p. 247–256. (doi, bib)
  • S. Virpioja, M.-S. Paukkeri, A. Tripathi, T. Lindh-Knuutila, and K. Lagus (2011), "Evaluating vector space models with canonical correlation analysis." Natural Language Engineering, to appear. Available on CJO 2011. (pdf, bib)
  • Lavergne, T., Le, H.-S., Allauzen, A., and Yvon, F. (2011), "LIMSI’s experiments in domain adaptation for IWSLT11." In M.-Y. Hwang and S. Stüker (eds.), Proceedings of the Eighth International Workshop on Spoken Language Translation (IWSLT). San Francisco, CA. (pdf, bib)
  • Freitag, M., Leusch, G., Wuebker, J., Peitz, S., Ney, H., Herrmann, T., Niehues, J., et al. (2011), "Joint WMT Submission of the QUAERO Project." In Proceedings of the Sixth Workshop on Statistical Machine Translation, p. 358–364, Association for Computational Linguistics, Edinburgh, Scotland. (pdf, bib)
  • Allauzen, A., Bonneau-Maynard, H., Le, H.-S., Max, A., Wisniewski, G., Yvon, F., Adda, G., et al. (2011), "LIMSI @ WMT11." In Proceedings of the Sixth Workshop on Statistical Machine Translation, p. 309–315, Association for Computational Linguistics, Edinburgh, Scotland. (bib)
  • Tomeh, N., Turchi, M., Wisniewski, G., Allauzen, A., and Yvon, F. (2011), "How Good Are Your Phrases? Assessing Phrase Quality with Single Class Classification." In M.-Y. Hwang and S. Stüker (eds.), Proceedings of the Eighth International Workshop on Spoken Language Translation (IWSLT). San Francisco, CA. (pdf, bib)
  • Le, H. S., Oparin, I., Messaoudi, A., Allauzen, A., Gauvain, J.-L., and Yvon, F. (2011), "Large Vocabulary SOUL Neural Network Language Models." In Proceedings of InterSpeech 2011. (bib)
  • Lardilleux, A., Lepage, Y., and Yvon, F. (2011), "The Contribution of Low Frequencies to Multilingual Sub-sentential Alignment. A Differential Associative Approach." International Journal of Advanced Intelligence 3(2):189–217. (bib)
  • Crego, J. M., Yvon, F., and Mariño, J. B. (2011), "N-code. An open-source Bilingual N-gram SMT Toolkit." Prague Bulletin of Mathematical Linguistics 96, p. 49–58. (bib)
  • Lavergne, T., Allauzen, A., Crego, J. M., and Yvon, F. (2011), "From n-gram-based to CRF-based Translation Models." In Proceedings of the Sixth Workshop on Statistical Machine Translation, p. 542–553, Association for Computational Linguistics, Edinburgh, Scotland. (bib)
  • Gahbiche-Braham, S., Bonneau-Maynard, H., and Yvon, F. (2011), "Two Ways to Use a Noisy Parallel News Corpus for Improving Statistical Machine Translation." In Proceedings of the 4th Workshop on Building and Using Comparable Corpora. Comparable Corpora and the Web, p. 44–51, Association for Computational Linguistics, Portland, Oregon. (bib)
  • Tomeh, N., Allauzen, A., and Yvon, F. (2011), "Discriminative Weighted Alignment Matrices for Statistical Machine Translation." In M. Forcada and H. Depraetere (eds.), Proceedings of the European Conference on Machine Translation, p. 305–312, Leuven, Belgium. (bib)
  • Sokolov, A., and Yvon, F. (2011), "Minimum Error Rate Semi-Ring." In M. Forcada and H. Depraetere (eds.), Proceedings of the European Conference on Machine Translation, p. 241–248, Leuven, Belgium. (bib)
  • Tomeh, N., Allauzen, A., Lavergne, T., and Yvon, F. (2011), "Designing an Improved Discriminative Word Aligner." In A. Gelbukh (ed.), Proceedings of the 12th International Conference on Intelligent Text Processing and Computational Linguistics, CICLING. Waseda, Japan. (bib)

2012

  • M.-S. Paukkeri, J. Väyrynen, and A. Arppe (2012), "Exploring extensive linguistic feature sets in near-synonym lexical choice." In Proceedings of 13th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), vol. 7182 of Lecture Notes in Computer Science. Springer. (doi, bib)
  • Hai-Son, L., Lavergne, T., Allauzen, A., Apidianaki, M., Gong, L., Max, A., Sokolov, A., et al. (2012), "LIMSI @ WMT12." In Proceedings of the Seventh Workshop on Statistical Machine Translation, poster session. Montréal, Canada. (bib)
  • Freitag, M., Peitz, S., Huck, M., Ney, H., Niehues, J., Herrmann, T., Waibel, A., et al. (2012), "Joint WMT 2012 Submission of the QUAERO Project." In Proceedings of the Seventh Workshop on Statistical Machine Translation, poster session. Montréal, Canada. (bib)
  • Apidianaki, M., Wisniewski, G., Sokolov, A., Max, A., and Yvon, F. (2012), "WSD for n-best reranking and local language modeling in SMT." Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation. Jeju Island, South Korea. (bib)
  • Zhuang, Y., Wisniewski, G., and Yvon, F. (2012), "Non-Linear Models for Confidence Estimation." In Proceedings of the Seventh Workshop on Statistical Machine Translation. Montréal, Canada. (bib)
  • Gahbiche-Braham, S., Bonneau-Maynard, H., Lavergne, T., and Yvon, F. (2012), "Joint Segmentation and POS Tagging for Arabic Using a CRF-based Classifier." Proceedings of the Language Resources and Evaluation Conference (LREC 2012). Istanbul, Turkey. (bib)
  • Sokolov, A., Wisniewski, G., and Yvon, F. (2012), "Computing Lattice BLEU Oracle Scores for Machine Translation." In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics, p. 120–129. Association for Computational Linguistics, Avignon, France. (bib)
  • Allauzen, A., and Yvon, F. (2012), "Textual Information Access." In E. Gaussier and F. Yvon (eds.), Statistical Methods for Machine Translation, p. 223–304. ISTE/Wiley, Paris. (bib)

Other Activities

 

Material on Document Classification to improve Statistical Machine Translation

 

T4ME METANET: Publications under WP4: Empirical Base for Machine Translation

2010

  • Russo, I. (2010). "Discovering Polarity for Ambiguous and Objective Adjectives through Adverbial Modification." In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010, Valletta, Malta, 17-23 May 2010), N. Calzolari, K. Choukri, B. Maegaard, J. Mariani, J. Odijk, S. Piperidis, M. Rosner, D. Tapias (eds.), European Language Resources Association (ELRA), p. 1159-1163. (pdf, bib)
  • Stein, D., Peitz, S., Vilar, D., and Ney, H. (2010), "A Cocktail of Deep Syntactic Features for Hierarchical Machine Translation." In Conference of the Association for Machine Translation in the Americas 2010 (AMTA 2010), Denver, Colorado, USA, num. 9, p. 9. (pdf, bib)

2011

  • Peter, J., Huck, M., Ney, H., and Stein, D. (2011), "Soft String-to-Dependency Hierarchical Machine Translation." In International Workshop on Spoken Language Translation (IWSLT), San Francisco, California, USA, p. 246-253. (pdf, bib)
  • Russo, I., Caselli, T., Rubino, F., Boldrini, E., Martínez-Barco, P. (2011). "EMOCause: An Easy-adaptable Approach to Extract Emotion Cause Contexts." In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (WASSA 2011), p. 153-160. (pdf, bib)

2012

  • Caselli, T., Russo, I., Rubino, F. (2012). "Assigning Connotation Values to Events." In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012, Istanbul, May 23rd-25th, 2012), N. Calzolari, K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, J. Odijk and S. Piperidis (eds.), p. 3082-3089. (pdf, bib)
  • Huck, M., Peitz, S., Freitag, M., and Ney, H. (2012), "Discriminative Reordering Extensions for Hierarchical Phrase-Based Machine Translation." In 16th Annual Conference of the European Association for Machine Translation (EAMT), p. 313-320, Trento, Italy. (pdf, bib)

CNR

As part of task T4.3 (with forthcoming M36 deliverable D4.3), we worked on a method to automatically identify linguistic contexts that contain possible causes of emotions or emotional states in Italian. The method combines relevant linguistic patterns with an incremental repository of common-sense knowledge on emotional states and emotion-eliciting situations. The approach was evaluated against manually annotated data; the results are satisfactory and support the validity of the proposed methodology. This work resulted in an oral presentation at WASSA 2011 and in a publication (Russo et al., 2011).
Extending the results of the previous paper, we produced a repository of event nouns with associated weighted polarity values. In particular, we amend SentiWordNet values by assigning event nouns scores compatible with their connotations. This work is going to be presented at LREC 2012 in May and will be published in the LREC 2012 proceedings.

Produced dataset: first release of the Italian event noun repository with associated connotational values, in Caselli et al. (to appear).
Software prototypes: none
Developed methods: guidelines for the annotation of emotion cause in context.
Presentations at workshops and conferences: oral presentation at WASSA 2011, an ACL HLT 2011 workshop ("EMOCause: An Easy-adaptable Approach to Extract Emotion Cause Contexts"). "Assigning Connotation Values to Events" will be presented at LREC 2012.

Exchange and co-operation with other disciplines: Part of the manually annotated corpus used in Russo et al. (2011) was annotated by students of linguistics at the University of Pavia. During a seminar in May 2011 they were instructed in the annotation scheme and, through their annotation work, provided input for further revision of the annotation guidelines.

What we are working on: Our objectives include extending the sentiment lexicon of eventive nouns to other languages through WordNet mappings, and including verbs in the Italian repository, so as to provide complete information about the subjective values of event-denoting words.
