
Commit

ci: Automated build push
actions-user committed Nov 14, 2020
1 parent 6c383f1 commit 661f83a
Showing 28 changed files with 14 additions and 14 deletions.
Binary file modified docs/.doctrees/bi_encoders.text_text.dpr2vec.doctree
Binary file modified docs/.doctrees/bi_encoders.text_text.lareqa_qa2vec.doctree
Binary file modified docs/.doctrees/bi_encoders.text_text.use_qa2vec.doctree
Binary file modified docs/.doctrees/encoders.audio.speech_embedding2vec.doctree
Binary file modified docs/.doctrees/encoders.audio.trill2vec.doctree
Binary file modified docs/.doctrees/encoders.audio.vggish2vec.doctree
Binary file modified docs/.doctrees/encoders.audio.yamnet2vec.doctree
Binary file modified docs/.doctrees/encoders.image.bit2vec.doctree
Binary file modified docs/.doctrees/encoders.image.inception_resnet2vec.doctree
Binary file modified docs/.doctrees/encoders.image.mobilenet2vec.doctree
Binary file modified docs/.doctrees/encoders.image.resnet2vec.doctree
Binary file modified docs/.doctrees/encoders.text.labse2vec.doctree
Binary file modified docs/.doctrees/encoders.text.legalbert2vec.doctree
Binary file modified docs/.doctrees/environment.pickle
2 changes: 1 addition & 1 deletion docs/bi_encoders.text_text.dpr2vec.html
@@ -200,7 +200,7 @@ <h1>DPR2Vec<a class="headerlink" href="#dpr2vec" title="Permalink to this headli
<div class="section" id="module-vectorhub.bi_encoders.text_text.torch_transformers.dpr">
<span id="transformers"></span><h2>Transformers<a class="headerlink" href="#module-vectorhub.bi_encoders.text_text.torch_transformers.dpr" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Dense Passage Retrieval</p>
-<p><strong>Vector Length</strong>: 68</p>
+<p><strong>Vector Length</strong>: 768 (default)</p>
<p><strong>Description</strong>:
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system largely by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.</p>
<p><strong>Paper</strong>: <a class="reference external" href="https://arxiv.org/abs/2004.04906">https://arxiv.org/abs/2004.04906</a></p>
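For orientation, a minimal usage sketch for the DPR2Vec bi-encoder documented in this file. The class name is taken from the page heading and the encode_question/encode_answer calls follow Vector Hub's usual question-answer bi-encoder pattern; neither is shown in this diff, so treat both as assumptions rather than the confirmed API.

# Sketch only: class name and methods assumed from the vectorhub docs page above.
from vectorhub.bi_encoders.text_text.torch_transformers import DPR2Vec

model = DPR2Vec()

# Questions and passages are mapped into the same space, 768-dimensional by default.
question_vec = model.encode_question("Who introduced dense passage retrieval?")
passage_vec = model.encode_answer("Dense Passage Retrieval (DPR) embeds questions and passages with a dual encoder.")

print(len(question_vec), len(passage_vec))  # expected: 768 768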
2 changes: 1 addition & 1 deletion docs/bi_encoders.text_text.lareqa_qa2vec.html
@@ -200,7 +200,7 @@ <h1>LAReQA2Vec<a class="headerlink" href="#lareqa2vec" title="Permalink to this
<div class="section" id="module-vectorhub.bi_encoders.text_text.tfhub.lareqa_qa">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.bi_encoders.text_text.tfhub.lareqa_qa" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: LAReQA: Language-agnostic answer retrieval from a multilingual pool</p>
-<p><strong>Vector Length</strong>: 512</p>
+<p><strong>Vector Length</strong>: 512 (default)</p>
<p><strong>Description</strong>:
We present LAReQA, a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool. Unlike previous cross-lingual tasks, LAReQA tests for “strong” cross-lingual alignment, requiring semantically related cross-language pairs to be closer in representation space than unrelated same-language pairs. Building on multilingual BERT (mBERT), we study different strategies for achieving strong alignment. We find that augmenting training data via machine translation is effective, and improves significantly over using mBERT out-of-the-box. Interestingly, the embedding baseline that performs the best on LAReQA falls short of competing baselines on zero-shot variants of our task that only target “weak” alignment. This finding underscores our claim that language-agnostic retrieval is a substantively new kind of cross-lingual evaluation.</p>
<p><strong>Paper</strong>: <a class="reference external" href="https://arxiv.org/abs/2004.05484">https://arxiv.org/abs/2004.05484</a></p>
2 changes: 1 addition & 1 deletion docs/bi_encoders.text_text.use_qa2vec.html
@@ -200,7 +200,7 @@ <h1>USEQA2Vec<a class="headerlink" href="#useqa2vec" title="Permalink to this he
<div class="section" id="module-vectorhub.bi_encoders.text_text.tfhub.use_qa">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.bi_encoders.text_text.tfhub.use_qa" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Universal Sentence Encoder Question Answering</p>
-<p><strong>Vector Length</strong>: 512</p>
+<p><strong>Vector Length</strong>: 512 (default)</p>
<p><strong>Description</strong>:
- Developed by researchers at Google, 2019, v2 [1].
- It is trained on a variety of data sources and tasks, with the goal of learning text representations that
2 changes: 1 addition & 1 deletion docs/encoders.audio.speech_embedding2vec.html
@@ -200,7 +200,7 @@ <h1>SpeechEmbedding2Vec<a class="headerlink" href="#speechembedding2vec" title="
<div class="section" id="module-vectorhub.encoders.audio.tfhub.speech_embedding">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.audio.tfhub.speech_embedding" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Speech Embedding</p>
-<p><strong>Vector Length</strong>: 96</p>
+<p><strong>Vector Length</strong>: 96 (default)</p>
<p><strong>Description</strong>:
With the rise of low power speech-enabled devices, there is a growing demand to quickly produce models for recognizing arbitrary
sets of keywords. As with many machine learning tasks, one of the most challenging parts in the model creation process is obtaining
2 changes: 1 addition & 1 deletion docs/encoders.audio.trill2vec.html
@@ -200,7 +200,7 @@ <h1>Trill2Vec<a class="headerlink" href="#trill2vec" title="Permalink to this he
<div class="section" id="module-vectorhub.encoders.audio.tfhub.trill">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.audio.tfhub.trill" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Trill - Triplet Loss Network</p>
-<p><strong>Vector Length</strong>: 512</p>
+<p><strong>Vector Length</strong>: 512 (default)</p>
<p><strong>Description</strong>:
The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a pre-existing embedding model trained for
different datasets or tasks. The visual and language communities have established benchmarks to compare embeddings, but the speech
2 changes: 1 addition & 1 deletion docs/encoders.audio.vggish2vec.html
@@ -200,7 +200,7 @@ <h1>Vggish2Vec<a class="headerlink" href="#vggish2vec" title="Permalink to this
<div class="section" id="module-vectorhub.encoders.audio.tfhub.vggish">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.audio.tfhub.vggish" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: VGGish</p>
-<p><strong>Vector Length</strong>: 512</p>
+<p><strong>Vector Length</strong>: 512 (default)</p>
<p><strong>Description</strong>:
An audio event embedding model trained on the YouTube-8M dataset.
VGGish should be used:
2 changes: 1 addition & 1 deletion docs/encoders.audio.yamnet2vec.html
@@ -200,7 +200,7 @@ <h1>Yamnet2Vec<a class="headerlink" href="#yamnet2vec" title="Permalink to this
<div class="section" id="module-vectorhub.encoders.audio.tfhub.yamnet">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.audio.tfhub.yamnet" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Yamnet</p>
-<p><strong>Vector Length</strong>: 1024</p>
+<p><strong>Vector Length</strong>: 1024 (default)</p>
<p><strong>Description</strong>:
YAMNet is an audio event classifier that takes audio waveform as input and makes independent predictions for each
of 521 audio events from the AudioSet ontology. The model uses the MobileNet v1 architecture and was trained using
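As a usage illustration for the audio encoders above (Yamnet2Vec shown), a minimal sketch assuming Vector Hub's common read/encode pattern for audio models. The class name follows the page heading and the read() helper is an assumption, not something shown in this diff.

# Sketch only: class name and read/encode helpers assumed from the vectorhub docs pages above.
from vectorhub.encoders.audio.tfhub import Yamnet2Vec

model = Yamnet2Vec()

# read() is assumed to load a waveform from a local path or URL (hypothetical file below);
# encode() then returns a single embedding, 1024-dimensional by default per the docs.
waveform = model.read("path/to/sample.wav")
vector = model.encode(waveform)
print(len(vector))  # expected: 1024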
2 changes: 1 addition & 1 deletion docs/encoders.image.bit2vec.html
@@ -200,7 +200,7 @@ <h1>Bit2Vec<a class="headerlink" href="#bit2vec" title="Permalink to this headli
<div class="section" id="module-vectorhub.encoders.image.tfhub.bit">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.image.tfhub.bit" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: BiT - Big Transfer, General Visual Representation Learning (Small)</p>
-<p><strong>Vector Length</strong>: 2048</p>
+<p><strong>Vector Length</strong>: 2048 (default)</p>
<p><strong>Description</strong>:
Transfer of pre-trained representations improves sample efficiency and simplifies hyperparameter tuning when training
deep neural networks for vision. We revisit the paradigm of pre-training on large supervised datasets and fine-tuning the model
2 changes: 1 addition & 1 deletion docs/encoders.image.inception_resnet2vec.html
@@ -200,7 +200,7 @@ <h1>InceptionResnet2Vec<a class="headerlink" href="#inceptionresnet2vec" title="
<div class="section" id="module-vectorhub.encoders.image.tfhub.inception_resnet">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.image.tfhub.inception_resnet" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Inception Resnet</p>
-<p><strong>Vector Length</strong>: 1536</p>
+<p><strong>Vector Length</strong>: 1536 (default)</p>
<p><strong>Description</strong>:
Very deep convolutional networks have been central to the largest advances in image recognition performance in
recent years. One example is the Inception architecture that has been shown to achieve very good performance at
2 changes: 1 addition & 1 deletion docs/encoders.image.mobilenet2vec.html
@@ -200,7 +200,7 @@ <h1>MobileNet2Vec<a class="headerlink" href="#mobilenet2vec" title="Permalink to
<div class="section" id="module-vectorhub.encoders.image.tfhub.mobilenet">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.image.tfhub.mobilenet" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: MobileNet</p>
-<p><strong>Vector Length</strong>: 1024</p>
+<p><strong>Vector Length</strong>: 1024 (default)</p>
<p><strong>Description</strong>:
We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build light weight deep neural networks. We introduce two simple global hyper-parameters that efficiently trade off between latency and accuracy. These hyper-parameters allow the model builder to choose the right sized model for their application based on the constraints of the problem. We present extensive experiments on resource and accuracy tradeoffs and show strong performance compared to other popular models on ImageNet classification. We then demonstrate the effectiveness of MobileNets across a wide range of applications and use cases including object detection, finegrain classification, face attributes and large scale geo-localization.</p>
<p><strong>Paper</strong>: <a class="reference external" href="https://arxiv.org/abs/1704.04861">https://arxiv.org/abs/1704.04861</a></p>
2 changes: 1 addition & 1 deletion docs/encoders.image.resnet2vec.html
@@ -200,7 +200,7 @@ <h1>ResNet2Vec<a class="headerlink" href="#resnet2vec" title="Permalink to this
<div class="section" id="module-vectorhub.encoders.image.tfhub.resnet">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.image.tfhub.resnet" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: ResNet</p>
-<p><strong>Vector Length</strong>: 2048</p>
+<p><strong>Vector Length</strong>: 2048 (default)</p>
<p><strong>Description</strong>:
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.</p>
<p><strong>Paper</strong>: <a class="reference external" href="https://arxiv.org/abs/1512.03385">https://arxiv.org/abs/1512.03385</a></p>
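Similarly, a hedged sketch for the image encoders (ResNet2Vec shown): the class name is taken from the page heading and the read/encode interface is assumed from Vector Hub's common pattern, neither being confirmed by this diff.

# Sketch only: class name, read() helper, and example URL are assumptions.
from vectorhub.encoders.image.tfhub import ResNet2Vec

model = ResNet2Vec()

# read() is assumed to fetch and preprocess an image from a URL or local path.
image = model.read("https://example.com/cat.png")
vector = model.encode(image)
print(len(vector))  # expected: 2048 by default, per the docs above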
2 changes: 1 addition & 1 deletion docs/encoders.text.labse2vec.html
@@ -200,7 +200,7 @@ <h1>LaBSE2Vec<a class="headerlink" href="#labse2vec" title="Permalink to this he
<div class="section" id="module-vectorhub.encoders.text.tfhub.labse">
<span id="tfhub"></span><h2>TFHub<a class="headerlink" href="#module-vectorhub.encoders.text.tfhub.labse" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: LaBSE - Language-agnostic BERT Sentence Embedding</p>
-<p><strong>Vector Length</strong>: 768</p>
+<p><strong>Vector Length</strong>: 768 (default)</p>
<p><strong>Description</strong>:
The language-agnostic BERT sentence embedding encodes text into high dimensional vectors. The model is trained and optimized to produce similar representations exclusively for bilingual sentence pairs that are translations of each other. So it can be used for mining for translations of a sentence in a larger corpus.</p>
<p><strong>Paper</strong>: <a class="reference external" href="https://arxiv.org/pdf/2007.01852v1.pdf">https://arxiv.org/pdf/2007.01852v1.pdf</a></p>
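And a sketch for the text encoders (LaBSE2Vec shown), again assuming the class name from the page heading and a plain encode(text) interface; both are assumptions, not confirmed by this diff.

# Sketch only: class name and encode() signature assumed from the vectorhub docs page above.
from vectorhub.encoders.text.tfhub import LaBSE2Vec

model = LaBSE2Vec()

# Sentences that are translations of each other should land close together
# in the embedding space, 768-dimensional by default per the docs.
en = model.encode("The cat sits on the mat.")
de = model.encode("Die Katze sitzt auf der Matte.")
print(len(en), len(de))  # expected: 768 768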
2 changes: 1 addition & 1 deletion docs/encoders.text.legalbert2vec.html
@@ -200,7 +200,7 @@ <h1>LegalBert2Vec<a class="headerlink" href="#legalbert2vec" title="Permalink to
<div class="section" id="module-vectorhub.encoders.text.torch_transformers.legal_bert">
<span id="transformers"></span><h2>Transformers<a class="headerlink" href="#module-vectorhub.encoders.text.torch_transformers.legal_bert" title="Permalink to this headline"></a></h2>
<p><strong>Model Name</strong>: Legal Bert</p>
-<p><strong>Vector Length</strong>: 768</p>
+<p><strong>Vector Length</strong>: 768 (default)</p>
<p><strong>Description</strong>:
BERT has achieved impressive performance in several NLP tasks. However, there has been limited investigation on its adaptation guidelines in specialised domains. Here we focus on the legal domain, where we explore several approaches for applying BERT models to downstream legal tasks, evaluating on multiple datasets. Our findings indicate that the previous guidelines for pre-training and fine-tuning, often blindly followed, do not always generalize well in the legal domain. Thus we propose a systematic investigation of the available strategies when applying BERT in specialised domains. These are: (a) use the original BERT out of the box, (b) adapt BERT by additional pre-training on domain-specific corpora, and (c) pre-train BERT from scratch on domain-specific corpora. We also propose a broader hyper-parameter search space when fine-tuning for downstream tasks and we release LEGAL-BERT, a family of BERT models intended to assist legal NLP research, computational law, and legal technology applications.</p>
<p><strong>Paper</strong>: <a class="reference external" href="https://arxiv.org/abs/2010.02559">https://arxiv.org/abs/2010.02559</a></p>
2 changes: 1 addition & 1 deletion docs/searchindex.js

Large diffs are not rendered by default.

