This is a small framework that makes it easy to evaluate Language Models with the STS Benchmark as well as other task-specific evaluation datasets. With it you can compare different models, or versions of the same model improved by fine-tuning. The framework currently uses STSBenchmark, the Spanish portion of STS2017, and an example of a custom evaluation dataset.
The framework wraps models from different sources and runs the selected evaluation with them, producing a standardized JSON output.
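The framework's own wrapper API is not reproduced here; the following is a minimal sketch of the kind of evaluation it standardizes, assuming sentence-transformers, scipy, and scikit-learn are available. The function name evaluate_sts and the JSON fields are illustrative, not the framework's actual interface.

```python
# Illustrative sketch only: embed sentence pairs, score their similarity,
# and emit a standardized JSON result (hypothetical field names).
import json
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import paired_cosine_distances

def evaluate_sts(model_name, sentences1, sentences2, gold_scores):
    model = SentenceTransformer(model_name)
    emb1 = model.encode(sentences1, convert_to_numpy=True)
    emb2 = model.encode(sentences2, convert_to_numpy=True)
    # Cosine similarity between each sentence pair
    cosine_scores = 1 - paired_cosine_distances(emb1, emb2)
    # Spearman correlation against the gold similarity labels
    spearman, _ = spearmanr(gold_scores, cosine_scores)
    return {"model": model_name, "dataset": "STSBenchmark", "spearman": float(spearman)}

if __name__ == "__main__":
    result = evaluate_sts(
        "sentence-transformers/all-MiniLM-L6-v2",
        ["A man is playing a guitar.", "A dog runs in the park."],
        ["Someone plays a guitar.", "A cat sleeps on the sofa."],
        [4.8, 0.5],
    )
    print(json.dumps(result, indent=2))
```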
Models can be sourced from:
Main Goal: Extension to other evaluation datasets
The main goal of this framework is to help evaluate Language Models on other context-specific tasks, as sketched below.
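As a rough sketch of what such an extension might look like (the TSV layout and the column names sentence1, sentence2, and score below are assumptions, not the framework's documented format), a custom STS-style dataset only needs sentence pairs plus gold similarity scores:

```python
# Hypothetical loader for a context-specific STS-style dataset stored as TSV.
import csv

def load_custom_sts(path):
    sentences1, sentences2, gold = [], [], []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            sentences1.append(row["sentence1"])
            sentences2.append(row["sentence2"])
            gold.append(float(row["score"]))
    return sentences1, sentences2, gold

# Reuse the evaluation sketch above on the custom dataset:
# s1, s2, gold = load_custom_sts("my_domain_sts.tsv")
# result = evaluate_sts("sentence-transformers/all-MiniLM-L6-v2", s1, s2, gold)
```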
Evaluation Results on current datasets
Check this notebook for the current results of evaluating several LMs on the standard datasets and on the context-specific example. These results closely match the ones published on PapersWithCode and SBERT Pretrained Models.