Computational Linguist(s) (M/F) – 40-100%
Place of employment: City of Zurich
Hieronymus Ltd provides high-quality legal translation services for law firms and for the legal departments of banks and large Swiss corporations. Since early 2020, Hieronymus Ltd has also offered LEXMachina, the first neural machine translation engine specialising in Swiss law. Although the initial results are already extremely promising, we intend to keep improving this technology in order to produce increasingly accurate translations. To this end, we are looking to hire one or more Computational Linguists interested in working in the field of legal linguistics and helping to develop dedicated neural translation engines.
As a Computational Linguist for LEXMachina, you will be responsible for the entire data processing pipeline and for improving the LEXMachina engines. More specifically, your duties will involve developing scraping scripts to gather content from websites; performing optical character recognition (OCR), and customising OCR tools, to extract text; converting certain content into standardised formats (e.g. PDF to XML or DOCX) and automating that conversion; cleaning the selected documents before they are fed to the engine (post-processing of the converted texts); and improving our NLP tool for categorising those documents by legal field. You will also develop and implement algorithms to pair texts available in multiple languages (document pairing), segment them, and align them to create parallel corpora that can be fed into LEXMachina. You will then evaluate the resulting corpora to identify any problems.
With the continual aim of improving LEXMachina, you will analyse user feedback and, in collaboration with our NMT partner, find appropriate ways to incorporate the requested changes into the system, for example by defining pre-editing or post-editing rules. Finally, in addition to constantly improving the "baseline" engine, you will help to develop new engines in various areas of law, finance and insurance.
Your profile:
- Master's degree (completed or in progress) in computational linguistics, computer science, engineering or a related field
- Passion for languages and a strong interest in the legal and financial fields
- Good knowledge of English, German, French (Italian a plus)
- Excellent analytical skills
- Good understanding of desktop publishing (DTP) and digital publishing processes (HTML markup, CSS, CMS, boilerplate, tags, encodings, etc.)
- Familiarity with OCR tools and processes (ABBYY FineReader, Tesseract, etc.); familiarity with publishing-related tools and data formats a plus (Adobe InDesign, Acrobat, QuarkXPress, image formats)
- Experience with the command line and with Python, Perl and batch (BAT) scripting (shell scripting a significant plus)
- Understanding of Unix/Linux operating systems, as well as Windows servers
- Familiarity with the translation industry, CAT tools and their file formats (XLIFF, SDLTM, TMX, CSV, etc.) a plus
What we can offer you:
- The opportunity to play an active role in the development of the first Swiss legal translation engine and to contribute to technological advances in this area
- Work in close collaboration with expert lawyer-linguists with a well-honed eye for quality and accuracy
- Various trips (outside Switzerland) to our NMT partner, one of the European market leaders in neural translation and related research
- A varied, interesting and challenging role
- The opportunity to organise and participate in various events related to neural translation (conferences, hackathons, etc.)
- A pleasant working atmosphere within a small, highly motivated team
The start date and salary will be agreed with the candidate(s) based on their level of training and experience. Please submit your application by email to: firstname.lastname@example.org (max. 2 MB). We look forward to receiving it.