    • The GDPR and Unstructured Data: Is Anonymisation Possible?

      Weitzenboeck, Emily Mary; Lison, Pierre; Cyndecka, Malgorzata Agnieszka; Langford, Malcolm (Journal article; Peer reviewed, 2022)
      Much of the legal and technical literature on data anonymization has focused on structured data such as tables. However, unstructured data such as text documents or images are far more common, and the legal requirements ...
    • Named Entity Recognition without Labelled Data: A Weak Supervision Approach 

      Lison, Pierre; Barnes, Jeremy; Hubin, Aliaksandr; Touileb, Samia (Chapter, 2020)
      Named Entity Recognition (NER) performance often degrades rapidly when applied to target domains that differ from the texts observed during training. When in-domain labelled data is available, transfer learning techniques ...
    • skweak: Weak Supervision Made Easy for NLP 

      Lison, Pierre; Barnes, Jeremy; Hubin, Aliaksandr (Chapter, 2021)
      We present skweak, a versatile, Python-based software toolkit enabling NLP developers to apply weak supervision to a wide range of NLP tasks. Weak supervision is an emerging machine learning paradigm based on a simple idea: ...