• TensorFlow & PyTorch
  • Python
  • Flask
  • Three.js
  • Figma & Vectary

Through an educational learning hub and an interactive playground, both hosted on the same web app, Paralang aims to cultivate literacy, inspire curiosity, and arouse concern with respect to emerging neural language models.

Recently released state-of-the-art language models have been shown to produce text that is nearly indistinguishable from text written by humans.

These recent advances, which have proved highly controversial within machine learning circles, have also caused ripples in the general media landscape, where coverage has been largely hyperbolic, excessive, and occasionally uninformed or even incorrect.

Believing that natural language generation technology is more than a mere novelty and will gradually assume an increasingly pervasive role in our everyday lives, I wanted to intervene, however modestly, by providing an accessible, beginner-friendly platform that helps demystify this technology and explains some of its inner workings as well as its repercussions for us both as individuals and as a society. I’d like to help answer questions like: what makes these recent advances so compelling and new? Or: how might existing societal problems be reproduced and reinforced by these advanced language models?

Ultimately, my aim is to help cultivate a more level-headed literacy and to inspire a sense of both informed curiosity and concern with respect to these emerging models and their ramifications, with an emphasis on the recent and state-of-the-art (particularly Google’s BERT and OpenAI’s GPT-2).

The platform consists of two components, both hosted on a single web app. One is educational, revolving around a learning hub, glossary, and resources curated for all skill levels: newcomer, intermediate, and advanced. The other is interactive, comprising a “playground” that encourages hands-on experimentation with some of the language models featured in the educational component.

Altogether, the platform is built to accommodate non-linear engagement: users can begin with the learning hub and progress through to the playground, jump straight to the playground, or simply skip around between the glossary and resources.

Here are some examples of slides from my learning hub section.

The project is still ongoing, but the field moves at an exceptional pace. A state-of-the-art model released mere months ago is soon overtaken by a new one built on a new pretraining methodology (autoregressive methods outpacing autoencoding ones in late June 2019, for example). New and ever more accessible tools are also being released, changing the conversation about who can take advantage of these technologies and for what purposes. Ordinarily I'm sympathetic to the argument that there is still plenty to be gleaned from out-of-date information and exposition about a field, provided it is sufficiently developed, well researched, and carefully laid out. But because the two primary focuses of my project (specific architectures and societal ramifications) change shape so frequently, an outdated, unmaintained resource like this one would risk painting a simply inaccurate picture of what's going on.

While keeping this platform up to date and maintaining it as mapped out in my prototype is untenable, at least for a single person who can't make it their first or second priority, I certainly plan on launching an abbreviated version where users can browse resources (which I keep up to date anyway) and perhaps also experiment in the playground section.