This virtual conference page is based on MiniConf by Alexander Rush and Hendrik Strobelt. It was extended by the amazing teams of ACL 2020 and ACL 2023, and adapted for EMNLP 2023 by the virtual infrastructure committee.
The website is based on Flask and Frozen-Flask. It:
- Parses conference-specific input data (e.g., TSV files downloaded from Google Sheets) into a common format.
- Generates a full static site from the common-format data, which can be deployed easily.
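As a rough illustration of the first step, the sketch below parses TSV text into plain records using only the standard library. The column names (paper_id, title, authors), the semicolon-separated author list, and the function name are assumptions made for this example; they are not the actual spreadsheet schema or the importer's API.

```python
import csv
import io
import json


def parse_papers_tsv(tsv_text):
    """Parse tab-separated paper rows into a list of plain dicts.

    The column names used here are hypothetical, not the real sheet schema.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    papers = []
    for row in reader:
        papers.append(
            {
                "id": row["paper_id"].strip(),
                "title": row["title"].strip(),
                # Assumption: authors are separated by semicolons in the sheet.
                "authors": [a.strip() for a in row["authors"].split(";") if a.strip()],
            }
        )
    return papers


# Made-up sample data standing in for a downloaded TSV file.
SAMPLE_TSV = (
    "paper_id\ttitle\tauthors\n"
    "101\tA Sample Paper\tAda Lovelace; Alan Turing\n"
)

papers = parse_papers_tsv(SAMPLE_TSV)
print(json.dumps(papers, indent=2))
```

In practice the text would be read from a downloaded file rather than from a string constant.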
- Install Python Poetry
Tip: The official installation process may not work in all network environments.
In that case, we recommend installing Poetry directly in a Python 3.10 environment with: pip install poetry
- Run:
poetry install
Tip: If the dependencies cannot be installed directly in this step, set this environment variable first:
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
- Run:
make run
- Visit
localhost:7777
When you are ready to deploy, run make freeze to get a static version of the site in the build folder.
This will not pick up any program changes, but it will pick up source code changes on the next GH Pages deploy.
We are very welcoming of community contributions!
- Bug Reports & Feature Requests: Open an issue giving reproducible steps
- Contributions:
- Clone the repo
- Run the "Quick Start" steps above and check that the site works
- Make changes
- Ensure make freeze runs without errors (warnings are fine)
- Make a pull request.
- If the changes look good to us and the site builds on GH Actions, we will merge and your changes will be live for everyone to see!
We did the following steps to adapt the website for EMNLP 2023 Miniconf:
- Change the global configs (configs/site.yaml)
- Add the pictures the website needs (static/images/emnlp2023/)
- Modify the front-pages (templates/)
- Create the EMNLP 2023 website data dir (data/emnlp_2023/)
We did the following steps to change from the ACL 2023 Miniconf to EMNLP 2023 Miniconf:
- Change the variable names in the code (e.g., "acl" -> "emnlp").
- Add the images we need in static/images.
- Update the URLs; the URL list is located in update.md.
- Download the conference data from Google. Process the file data/emnlp_2023/data/EMNLP Event Program & Details.xlsx to generate the files oral-papers.tsv and poster-demo-papers.tsv, and process the file data/emnlp_2023/data/EMNLP 2023_Presenter Info Schedule.xlsx to generate the file input.xlsx.
- Run python emnlp_miniconf/import_emnlp2023.py to generate conference.json. Do not forget to change the sys.path in emnlp_miniconf/import_emnlp2023.py.
- Write the Python files generate_BoF_display.py, write_event_to_booklet.py, generate_booklet_data_json.py, generate_break_display.py, generate_coffee_break_json.py, and generate_workshops_yaml.py to preprocess the file input.xlsx. Write the Python files write_BoF_display_to_conference.py, write_break_display_to_conference.py, write_break_to_conference.py, write_keynotes_to_conference.py, and write_social_event_display_to_confenence.py to add data to conference.json. You only need to run sh pipeline.sh to get the new conference.json. (By default, the files are expected in data/emnlp_2023/data unless declared otherwise.)
- Run make run.
EMNLP 2023 conference data is generated in one step once you have prepared the data:
- Data is located in data/emnlp_2023/data and should be downloaded from the (private) conference Google spreadsheets into the files input.xlsx, oral-papers.tsv, and poster-demo-papers.tsv.
- Run sh pipeline.sh to generate conference.json.
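Before running the pipeline, it can help to confirm that the three input files named above are actually in place. The following is a minimal sketch of such a check; the helper name missing_inputs is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the step above; the helper itself is a hypothetical addition.
REQUIRED_INPUTS = ("input.xlsx", "oral-papers.tsv", "poster-demo-papers.tsv")


def missing_inputs(data_dir):
    """Return the required input files that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in REQUIRED_INPUTS if not (data_dir / name).is_file()]


missing = missing_inputs("data/emnlp_2023/data")
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All input files are present; the pipeline can be run.")
```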
ACL 2023 Miniconf is generated in three steps:
- Data is located in data/acl_2023/data and should be downloaded from the (private) conference Google spreadsheets into these files: oral-papers.tsv, poster-demo-papers.tsv, spotlight-papers.tsv, and virtual-papers.tsv. There are additional files in the same Google Drive; make sure to download all of them to that folder.
- Run python acl_miniconf/import_acl2023.py to generate the data in auto_data/acl_2023/, as conference.json, conference.pkl, and conference.yaml (all with the same content).
- Run make run
- Before pushing, run make freeze and check that it works
Aggregated workshop paper proceedings are stored as public source data in data/acl_2023/data, which is subsequently parsed by the import_acl2023.py script. The file data/acl_2023/data/workshop_papers.yml can be generated by:
- Running scripts/clone_workshops.sh
- Running python acl_miniconf/import_acl2023_workshop_papers.py
Following this, be sure to rerun the script that remakes conference.json: python acl_miniconf/import_acl2023.py
- Obtain and incorporate info beyond paper program
- Obtain/incorporate underline links for papers/events
The repository consists of the following main components:
- Datastore sitedata
Collection of data files representing the papers, speakers, workshops, and other important information for the conference.
- Routing main.py
This file defines the Flask app and the routes.
- Templates templates
Contains all the pages for the site. See base.html for the master page and components.html for core components.
- Frontend static
Contains frontend components like the default CSS, images, and JavaScript libraries.
- Scripts scripts
Contains additional preprocessing to add visualizations, recommendations, schedules to the conference.
- acl_miniconf uses a shared data format, defined in acl_miniconf.data.Conference, that current and future conferences should use.
- acl_miniconf/import_acl2023.py: Parses program information from spreadsheets/underline to generate a single acl_miniconf.data.Conference object, which is converted to JSON for the frontend and YAML for human inspection.
- The static components of the website consume the JSON output to generate the website.
The main reasons for this architecture are:
- Abstract the data parsing from the data format.
- Keep a static format for the frontend.
- Move conference-specific processing out of shared library code.
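The separation described above can be sketched with standard-library dataclasses: a conference-specific importer builds one shared object, and a generic serializer turns it into JSON for the frontend. The classes SketchPaper and SketchConference and their fields are hypothetical stand-ins for illustration only; they are not the actual acl_miniconf.data.Conference definition.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class SketchPaper:
    # Hypothetical fields; the real shared format may differ.
    id: str
    title: str
    authors: List[str] = field(default_factory=list)


@dataclass
class SketchConference:
    name: str
    papers: Dict[str, SketchPaper] = field(default_factory=dict)


def import_example_conference():
    """Stand-in for a conference-specific importer that fills the shared format."""
    conference = SketchConference(name="Example Conf")
    conference.papers["p1"] = SketchPaper(
        id="p1", title="A Sample Paper", authors=["Ada Lovelace"]
    )
    return conference


def to_frontend_json(conference):
    """Generic serializer: knows only the shared format, not any importer."""
    return json.dumps(asdict(conference), indent=2, sort_keys=True)


conference_json = to_frontend_json(import_example_conference())
print(conference_json)
```

A new conference would then only need its own importer function; the serializer and the frontend stay unchanged.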
Once you have a RocketChat instance, be sure to:
- Get an API key; see https://docs.rocket.chat/use-rocket.chat/user-guides/user-panel/my-account
- Copy the template in configs/rocketchat/template.yaml to another file in the same directory. Add your user/API keys there, but do not commit it!
- Follow these instructions and ensure that getting all paginated results is allowed: https://developer.rocket.chat/reference/api/rest-api#pagination
To run the CLI from the root directory: python -m acl_miniconf.rocketchat.cli --config-name rc-pedro command=add_emojis
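To illustrate why paginated results matter, here is a minimal sketch of collecting every item from an endpoint that is paged by offset and count. It assumes the endpoint reports a total alongside each page, as the linked pagination documentation appears to describe. The function fetch_all and the fake in-memory endpoint are hypothetical; this is not the code used by acl_miniconf.rocketchat, and no real HTTP request is made.

```python
def fetch_all(fetch_page, page_size=50):
    """Collect all items from an offset/count paginated source.

    fetch_page(offset, count) is a caller-supplied function (hypothetical here)
    that returns a tuple (items, total).
    """
    items = []
    offset = 0
    while True:
        page, total = fetch_page(offset, page_size)
        items.extend(page)
        offset += len(page)
        # Stop when the source is exhausted or returns an empty page.
        if not page or offset >= total:
            break
    return items


def fake_fetch_page(offset, count):
    """In-memory stand-in for a REST call that lists 120 channels."""
    data = [f"channel-{i}" for i in range(120)]
    return data[offset:offset + count], len(data)


all_channels = fetch_all(fake_fetch_page)
print(len(all_channels))
```

With a real server, fetch_page would wrap the HTTP request and pass the offset and count as query parameters.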
This section describes all pages that are in this version of MiniConf and how to customize them.
This page is mainly configured via sitedata/config.yml. One mainly needs to change the conference name, date,
number of workshops/tutorials and help documents here. Before the conference starts, also update the
acknowledgements. In static/js/time-extend.js, you also need to change the start and end times of the conference to
make sure that daylight saving is handled correctly.
The schedule is based on FullCalendar. The entries are automatically generated from
the data of the paper sessions/workshops/plenary. For the weekly view, we compute blocks of events and show only a
generic name; in the day view, they are shown as is. Additional events can be added via sitedata/overall_calendar.yml.
If you add new event types, make sure to assign them a color in load_site_data::build_schedule.
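One plausible way to compute the weekly-view blocks is to merge overlapping events into a single interval that carries a generic name. The sketch below shows that idea under stated assumptions: the function merge_into_blocks, the block name, and the sample events are all hypothetical, and this is not the actual logic in load_site_data::build_schedule.

```python
from datetime import datetime


def merge_into_blocks(events, block_name="Paper Sessions"):
    """Merge overlapping (start, end, title) events into generic blocks.

    Events that overlap or touch end up in the same block; individual
    titles are replaced by block_name.
    """
    blocks = []
    for start, end, _title in sorted(events, key=lambda event: event[0]):
        if blocks and start <= blocks[-1][1]:
            # Overlaps (or touches) the current block: extend its end time.
            blocks[-1][1] = max(blocks[-1][1], end)
        else:
            blocks.append([start, end, block_name])
    return [tuple(block) for block in blocks]


def at(hour, minute=0):
    """Helper building a time on an arbitrary, made-up day."""
    return datetime(2023, 1, 1, hour, minute)


events = [
    (at(9), at(10), "Oral 1A"),
    (at(9, 30), at(10, 30), "Oral 1B"),
    (at(14), at(15), "Poster 2"),
]

blocks = merge_into_blocks(events)
for start, end, name in blocks:
    print(f"{start:%H:%M}-{end:%H:%M} {name}")
```

The day view would skip this step and render each event with its own title.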
Plenary sessions make up the official program of the conference. For EMNLP, we had live events and prerecorded keynote talks. Keynotes were prerecorded but livestreamed via SlidesLive. Panels were fully live and used CART real-time captioning. We did not add the prerecorded talks to the plenary page before the talk was done. Each plenary details page can either have one video or a list of videos. It can also have a RocketChat channel. The default is to just show one SlidesLive video. Refer to the ACL2020 repo for how multiple videos can be shown on a plenary details page.
Plenary events were livestreamed via SlidesLive. Make sure that your contract with them covers this.
For EMNLP, we had prerecorded presentations and keynotes, and live panels. Recordings were streamed live. For keynotes, after the
prerecorded talk was shown, the keynote speaker, a volunteer and a SlidesLive person joined a Zoom call, which was
streamed seamlessly after the keynote recording. Then the volunteer took questions from the #live RocketChat channel
and put them to the keynote speaker.
Before the livestream starts, change the livestream ID and CART URL on livestream.html. The livestream SlidesLive IDs
are different from the actual ID for the recording, e.g. if a keynote is livestreamed,
then the livestream ID is different from the prerecorded ID. Make sure to update the ID before the stream starts
and ask SlidesLive to show a "Livestream will start soon" message. After the livestream is done, add the ID
to the plenary event itself. You can use the livestream ID as a SlidesLive ID for the player; it will then show
a recording of the livestream.
We use SPECTER to generate document embeddings from abstracts and
DyGIE++ to generate keywords. We then use umap to project the embeddings to 2D.
Recommendations are generated using n-nearest neighbours.
We provide scripts/dataentry/projections.py to generate the projections. Please
refer to the respective repositories to find out how to install these tools. Install each in its own separate virtualenv,
as they all have conflicting dependencies.
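The nearest-neighbour recommendation step can be illustrated with a small pure-Python sketch. It assumes cosine similarity as the distance measure and uses tiny made-up 3-dimensional vectors in place of real document embeddings; the function names are hypothetical and this is not the code in scripts/dataentry/projections.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(paper_id, embeddings, n=2):
    """Return the ids of the n papers most similar to paper_id."""
    query = embeddings[paper_id]
    scored = [
        (cosine_similarity(query, vector), other_id)
        for other_id, vector in embeddings.items()
        if other_id != paper_id
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other_id for _score, other_id in scored[:n]]


# Made-up toy embeddings; real document embeddings have many more dimensions.
embeddings = {
    "p1": [1.0, 0.0, 0.0],
    "p2": [0.9, 0.1, 0.0],
    "p3": [0.0, 1.0, 0.0],
    "p4": [0.7, 0.7, 0.0],
}

print(recommend("p1", embeddings))  # ['p2', 'p4']
```

For a full conference, a library-based nearest-neighbour index would scale better than this brute-force comparison.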
We extract images from the PDFs and upload them to Amazon S3. You can use our rough script under
scripts/dataentry/extract_images.py to extract them. In order to allow authors to change their images,
we set up an additional GitHub repository that
contains the images. Pull requests there automatically deploy the updated images to S3. Look at the GitHub workflow
there and set the respective secrets for the autodeploy.
Connected Papers is a visual tool to help researchers and applied scientists find academic papers relevant to their field of work. In addition to linking to the connected papers for each main paper, they built a custom page for EMNLP which is shown below every main paper presentation. If you want that also, then conference papers need to be indexed by SemanticScholar before the conference. Contact them via their website and ask nicely, then they will also help you.
Tutorials simply contain a schedule of the events, website, Zoom links, RocketChat and optionally a prerecorded SlidesLive
video. We asked tutorial organizers to fill in the information in Google Sheets and then used a script to download that
sheet and parse it. You can refer to scripts/dataentry/tutorials.py to see how we loaded tutorials. We used
this template for tutorial organizers to fill in. Tutorial blocks on the tutorial overview page and in the main schedule are computed
automatically.
Workshops contain a schedule of the events (Markdown can be used there), website, Zoom links, RocketChat, a list of prerecorded
SlidesLive videos for invited talks and a link to workshop papers. We asked workshop organizers to fill in the information
in Google Sheets and then used a script to download that sheet and parse it. You can refer to scripts/dataentry/workshop.py
to see how we loaded workshops. We used this template for
workshop organizers to fill in. Workshop blocks on the workshop overview page and in the main schedule are computed automatically.
We gave each workshop up to 5 Zoom links, but did not schedule events for them. Instead, we linked the personal meeting ID
on our website.
We asked social organizers to fill in the information in Google Sheets and then used a script to download that sheet and parse
it. You can refer to scripts/dataentry/socials.py to see how we loaded socials. We used this template for
social organizers to fill in. Social blocks in the main schedule are computed automatically.
We gave each social event one Zoom link, or they could get a Gather room, but we did not schedule events for them. Instead, we
linked the personal meeting ID on our website.
We use RocketChat throughout this page to have channels for keynotes, plenaries, live, papers, workshops, tutorials and many more. You can host RocketChat yourself, e.g. on AWS (refer to their guide) or pay them to host it. We normally first create a demo workspace and then upgrade it to full for one month with the number of anticipated users during the conference.
We integrate RocketChat via SSO into our Amazon Cognito user repository so that only one set of username and password is needed. For that, you can refer to this guide. You do not need to change the Lambda functions if you set up the project correctly when creating the AWS app.
You can refer to our checklist for RocketChat to have an idea what needs to be done.
We have scripts for RocketChat setup in scripts/rocketchat and here.
In order to use the Active Chat feature, you need to use RocketChat and host the statistics server. Please refer to
scripts/channels-stats to see how. If you do not want it, remove it from base.html.
We use VirtualChair to manage Gather.Town for us. We strongly recommend booking them as well, and asking them early enough. We use SSO with Gather.Town; you need to specifically ask for it and make sure that it is included in the contract.
We buy one Zoom master account and then create accounts for the specific events. For things like tutorials, socials and workshops, we do not schedule events but use the personal meeting link. You can use these scripts to help you create the Zoom resources.
We use this website to generate favicons from an image.
MiniConf was built by Hendrik Strobelt and Sasha Rush.
Thanks to Darren Nelson for the original design sketches; to Shakir Mohamed, Martha White, Kyunghyun Cho, Lee Campbell, and Adam White for planning and feedback; and to Hao Fang, Junaid Rahim, Jake Tae, Yasser Souri, Soumya Chatterjee, and Ankshita Gupta for contributions.
It was extended by the ACL 2020 virtual infrastructure team, especially by Hao Fang and Sudha Rao with the help of many volunteers.
It was extended for EMNLP 2020 by Jan-Christoph Klie with the help of the EMNLP 2020 virtual infrastructure team and with the help of many volunteers.