The "garbage in garbage out" status quo remains unchanged despite the fact that LLMs have advanced Natural Language Processing (NLP) significantly. In response, RAGFlow introduces two unique features compared to other Retrieval-Augmented Generation (RAG) products.
- Fine-grained document parsing: Document parsing involves images and tables, with the flexibility for you to intervene as needed.
- Traceable answers with reduced hallucinations: You can trust RAGFlow's responses as you can view the citations and references supporting them.
English, Simplified Chinese, and Traditional Chinese for now.
We put painstaking effort into document pre-processing tasks like layout analysis, table structure recognition, and OCR (Optical Character Recognition) using our vision model. This contributes to the additional time required.
RAGFlow has a number of built-in models for document structure parsing, which account for the additional computational resources.
Currently, we only support x86 CPU and Nvidia GPU.
The corresponding APIs are now available. See the Conversation API for more information.
No, this feature is still in development. Contributions are welcome.
Yes, this feature is now available.
5. Do you support multiple rounds of dialogues, i.e., referencing previous dialogues as context for the current dialogue?
This feature and the related APIs are still in development. Contributions are welcome.
$ git clone https://github.com/infiniflow/ragflow.git
$ cd ragflow
$ docker build -t infiniflow/ragflow:latest .
$ cd docker
$ chmod +x ./entrypoint.sh
$ docker compose up -d
- Check your network from within Docker, for example:
curl https://hf-mirror.com
- If your network works fine, the issue lies with the Docker network configuration. Replace the Docker building command:
docker build -t infiniflow/ragflow:vX.Y.Z .
With this:
docker build -t infiniflow/ragflow:vX.Y.Z . --network host
2.1 Cannot access https://huggingface.co
A locally deployed RAGFlow downloads OCR and embedding modules from the Hugging Face website by default. If your machine is unable to access this site, the following error occurs and PDF parsing fails:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res'
To fix this issue, use https://hf-mirror.com instead:
- Stop all containers and remove all related resources:
cd ragflow/docker/
docker compose down
- Replace https://huggingface.co with https://hf-mirror.com in ragflow/docker/docker-compose.yml.
- Start up the server:
docker compose up -d
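The replacement step above can also be scripted. The following is a minimal sketch, assuming the compose file contains the literal string https://huggingface.co; the function name use_hf_mirror is hypothetical and not part of RAGFlow.

```shell
# Hypothetical helper (not part of RAGFlow): point a compose file at the mirror.
# Assumes the file contains the literal URL https://huggingface.co.
use_hf_mirror() {
    compose_file="$1"
    # Write to a temporary file first so the sketch works with GNU and BSD sed.
    sed 's|https://huggingface\.co|https://hf-mirror.com|g' "$compose_file" > "$compose_file.tmp" &&
        mv "$compose_file.tmp" "$compose_file"
}
```

For example, run use_hf_mirror ragflow/docker/docker-compose.yml after docker compose down and before docker compose up -d.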
This error suggests that you do not have Internet access or are unable to connect to hf-mirror.com. Try the following:
- Manually download the resource files from huggingface.co/InfiniFlow/deepdoc to your local folder ~/deepdoc.
- Add a volume to docker-compose.yml, for example:
- ~/deepdoc:/ragflow/rag/res/deepdoc
2.3 FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/huggingface/hub/models--InfiniFlow--deepdoc/snapshots/be0c1e50eef6047b412d1800aa89aba4d275f997/ocr.res' (or '/ragflow/rag/res/deepdoc/ocr.res')
- Check your network from within Docker, for example:
curl https://hf-mirror.com
- Run ifconfig to check the mtu value. If the server's mtu is 1450 while the NIC's mtu in the container is 1500, this mismatch may cause network instability. Adjust the mtu policy as follows:
vim docker-compose-base.yml
# Original configuration:
networks:
  ragflow:
    driver: bridge
# Modified configuration:
networks:
  ragflow:
    driver: bridge
    driver_opts:
      com.docker.network.driver.mtu: 1450
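Before editing the network configuration, it can help to confirm that the two MTU values really differ. The following is a minimal sketch; the function name check_mtu is hypothetical, and the interface name eth0 in the usage note is an assumption that may not match your system.

```shell
# Hypothetical helper: compare the host MTU with the MTU seen inside a container.
# Prints "match" on equal values; prints a mismatch message and returns 1 otherwise.
check_mtu() {
    host_mtu="$1"
    container_mtu="$2"
    if [ "$host_mtu" -eq "$container_mtu" ]; then
        echo "match"
    else
        echo "mismatch: host=$host_mtu container=$container_mtu"
        return 1
    fi
}
```

On Linux, one way to feed it real values is check_mtu "$(cat /sys/class/net/eth0/mtu)" "$(docker exec ragflow-server cat /sys/class/net/eth0/mtu)", assuming both sides expose an eth0 interface.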
Ignore this warning and continue. All system warnings can be ignored.
You cannot log in to RAGFlow until the server is fully initialized. Run docker logs -f ragflow-server to check its status.
The server is successfully initialized if your system displays the following:
____ ______ __
/ __ \ ____ _ ____ _ / ____// /____ _ __
/ /_/ // __ `// __ `// /_ / // __ \| | /| / /
/ _, _// /_/ // /_/ // __/ / // /_/ /| |/ |/ /
/_/ |_| \__,_/ \__, //_/ /_/ \____/ |__/|__/
/____/
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:9380
* Running on http://x.x.x.x:9380
INFO:werkzeug:Press CTRL+C to quit
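If you would rather wait for initialization in a script than watch the log by hand, the check can be sketched as follows. This is only an illustration: the function name server_ready is hypothetical, and it assumes the startup log contains the "Running on all addresses" line shown above.

```shell
# Hypothetical helper: read log text on stdin and succeed if it contains
# the "Running on all addresses" line from the startup output.
server_ready() {
    grep -q 'Running on all addresses'
}
```

A polling loop built on it might look like: until docker logs ragflow-server 2>&1 | server_ready; do sleep 5; done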
dependency failed to start: container ragflow-mysql is unhealthy means that your MySQL container failed to start. Try replacing mysql:5.7.18 with mariadb:10.5.8 in docker-compose-base.yml.
Ignore this warning and continue. All system warnings can be ignored.
Parsing requests have to wait in a queue due to limited server resources. We are currently enhancing our algorithms and increasing computing power.
If your RAGFlow is deployed locally, try the following:
- Click the red cross icon next to Parsing Status and refresh the file parsing process.
- If the issue still persists, try the following:
- Check the log of your RAGFlow server to see if it is running properly:
docker logs -f ragflow-server
- Check if the task_executor.py process exists.
- Check if your RAGFlow server can access hf-mirror.com or huggingface.co.
If your RAGFlow is deployed locally, the parsing process is likely killed due to insufficient RAM. Try increasing your memory allocation by increasing the MEM_LIMIT value in docker/.env.
Ensure that you restart your RAGFlow server for your changes to take effect:
docker compose stop
docker compose up -d
An index failure usually indicates an unavailable Elasticsearch service.
tail -f path_to_ragflow/docker/ragflow-logs/rag/*.log
$ docker ps
The system displays the following if all your RAGFlow components are running properly:
5bc45806b680 infiniflow/ragflow:latest "./entrypoint.sh" 11 hours ago Up 11 hours 0.0.0.0:80->80/tcp, :::80->80/tcp, 0.0.0.0:443->443/tcp, :::443->443/tcp, 0.0.0.0:9380->9380/tcp, :::9380->9380/tcp ragflow-server
91220e3285dd docker.elastic.co/elasticsearch/elasticsearch:8.11.3 "/bin/tini -- /usr/l…" 11 hours ago Up 11 hours (healthy) 9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp ragflow-es-01
d8c86f06c56b mysql:5.7.18 "docker-entrypoint.s…" 7 days ago Up 16 seconds (healthy) 0.0.0.0:3306->3306/tcp, :::3306->3306/tcp ragflow-mysql
cd29bcb254bc quay.io/minio/minio:RELEASE.2023-12-20T01-00-02Z "/usr/bin/docker-ent…" 2 weeks ago Up 11 hours 0.0.0.0:9001->9001/tcp, :::9001->9001/tcp, 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp ragflow-minio
- Check the status of your Elasticsearch component:
$ docker ps
The status of a 'healthy' Elasticsearch component in your RAGFlow should look as follows:
91220e3285dd docker.elastic.co/elasticsearch/elasticsearch:8.11.3 "/bin/tini -- /usr/l…" 11 hours ago Up 11 hours (healthy) 9300/tcp, 0.0.0.0:9200->9200/tcp, :::9200->9200/tcp ragflow-es-01
- If your container keeps restarting, ensure vm.max_map_count >= 262144 as per this README. Updating the vm.max_map_count value in /etc/sysctl.conf is required if you wish to keep your change permanent. This configuration works only for Linux.
- If your issue persists, ensure that the ES host setting is correct:
- If you are running RAGFlow with Docker, it is in docker/service_conf.yml. Set it as follows:
es:
  hosts: 'http://es01:9200'
- If you run RAGFlow outside of Docker, verify the ES host setting in conf/service_conf.yml using:
curl http://<IP_OF_ES>:<PORT_OF_ES>
This is because you forgot to update the vm.max_map_count value in /etc/sysctl.conf and your change to this value was reset after a system reboot.
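To verify the current setting against the 262144 minimum mentioned above, a small check can be sketched as follows. The function name map_count_ok is hypothetical; the commands in the usage note are standard Linux sysctl calls.

```shell
# Hypothetical helper: succeed if the given vm.max_map_count value meets
# the 262144 minimum.
map_count_ok() {
    [ "$1" -ge 262144 ]
}
```

For example: map_count_ok "$(sysctl -n vm.max_map_count)" || echo "vm.max_map_count is too low". Running sudo sysctl -w vm.max_map_count=262144 raises the value until the next reboot; adding the line vm.max_map_count=262144 to /etc/sysctl.conf keeps it across reboots.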
Your IP address or port number may be incorrect. If you are using the default configurations, enter http://<IP_OF_YOUR_MACHINE> (NOT 9380, AND NO PORT NUMBER REQUIRED!) in your browser. This should work.
A correct Ollama IP address and port are crucial to adding models to Ollama:
- If you are on demo.ragflow.io, ensure that the server hosting Ollama has a publicly accessible IP address. Note that 127.0.0.1 is not a publicly accessible IP address.
- If you deploy RAGFlow locally, ensure that Ollama and RAGFlow are in the same LAN and can communicate with each other.
Yes, we do. See the Python files under the rag/app folder.
You probably forgot to update the MAX_CONTENT_LENGTH environment variable:
- Add the environment variable MAX_CONTENT_LENGTH to ragflow/docker/.env:
MAX_CONTENT_LENGTH=100000000
- Update docker-compose.yml:
environment:
  - MAX_CONTENT_LENGTH=${MAX_CONTENT_LENGTH}
- Restart the RAGFlow server:
docker compose up ragflow -d
Now you should be able to upload files smaller than 100 MB.
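Editing the .env file can be scripted so that repeated runs do not leave duplicate entries. The following is a minimal sketch; set_env_var is a hypothetical helper, not a RAGFlow or Docker command.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing an existing
# KEY line if there is one and appending the line otherwise.
# Assumes the value contains no '|' or '&' and the file ends with a newline.
set_env_var() {
    env_file="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}=" "$env_file"; then
        sed "s|^${key}=.*|${key}=${value}|" "$env_file" > "$env_file.tmp" &&
            mv "$env_file.tmp" "$env_file"
    else
        echo "${key}=${value}" >> "$env_file"
    fi
}
```

For example: set_env_var ragflow/docker/.env MAX_CONTENT_LENGTH 100000000. The same helper works for other variables in that file, such as MEM_LIMIT.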
This exception occurs when starting up the RAGFlow server. Try the following:
- Prolong the sleep time: Go to docker/entrypoint.sh, locate line 26, and replace sleep 60 with sleep 280.
- If using Windows, ensure that entrypoint.sh has LF line endings.
- Go to docker/docker-compose.yml, add the following:
./entrypoint.sh:/ragflow/entrypoint.sh
- Change directory:
cd docker
- Stop the RAGFlow server:
docker compose stop
- Restart the RAGFlow server:
docker compose up
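For the line-ending step above, a file checked out on Windows can be converted to LF with standard tools. The following is a minimal sketch; to_lf is a hypothetical helper name.

```shell
# Hypothetical helper: convert CRLF line endings to LF in place.
to_lf() {
    target="$1"
    # tr deletes every carriage return character. The result is copied back
    # with cat (rather than mv) so the original file keeps its permissions.
    tr -d '\r' < "$target" > "$target.tmp" &&
        cat "$target.tmp" > "$target" &&
        rm -f "$target.tmp"
}
```

For example: to_lf docker/entrypoint.sh, run from the ragflow directory before restarting the server.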
- Ensure that the RAGFlow server can access the base URL.
- Do not forget to append /v1/ to http://IP:port: http://IP:port/v1/
- Check if the status of your MinIO container is healthy:
docker ps
- Ensure that the username and password settings of MySQL and MinIO in docker/.env are in line with those in docker/service_conf.yml.
- Right-click the desired dialog to display the Chat Configuration window.
- Switch to the Model Setting tab and adjust the Max Tokens slider to get the desired length.
- Click OK to confirm your change.
If nothing is retrieved from your knowledge base, the system responds only with what you specify in Empty response. If you do not specify anything in Empty response, you let your LLM improvise, giving it a chance to hallucinate.
You can use Ollama to deploy a local LLM. See here for more information.
- If RAGFlow is locally deployed, ensure that your RAGFlow and Ollama are in the same LAN.
- If you are using our online demo, ensure that the IP address of your Ollama server is public and accessible.
- Click Knowledge Base at the top center of the page.
- Right-click the desired knowledge base to display the Configuration dialogue.
- Choose Q&A as the chunk method and click Save to confirm your change.
No, connecting to Redis is not required.
This error occurs because there are too many chunks matching your search criteria. Try reducing the TopN and increasing the Similarity threshold to fix this issue:
- Click Chat at the top center of the page.
- Right-click the desired conversation > Edit > Prompt Engine.
- Reduce the TopN and/or raise the Similarity threshold.
- Click OK to confirm your changes.
You can upgrade RAGFlow to either the dev version or the latest version:
- Dev versions are for developers and contributors. They are published on a nightly basis and may crash because they are not fully tested. We cannot guarantee their stability, and you try out the latest, untested features at your own risk.
- The latest version refers to the most recent, officially published release. It is stable and best suited to regular users.
To upgrade RAGFlow to the dev version:
- Pull the latest source code:
cd ragflow
git pull
- If you used docker compose up -d to start up the RAGFlow server:
docker pull infiniflow/ragflow:dev
docker compose up ragflow -d
- If you used docker compose -f docker-compose-CN.yml up -d to start up the RAGFlow server:
docker pull swr.cn-north-4.myhuaweicloud.com/infiniflow/ragflow:dev
docker compose -f docker-compose-CN.yml up -d
To upgrade RAGFlow to the latest version:
- Update ragflow/docker/.env as follows:
RAGFLOW_VERSION=latest
- Pull the latest source code:
cd ragflow
git pull
- If you used docker compose up -d to start up the RAGFlow server:
docker pull infiniflow/ragflow:latest
docker compose up ragflow -d
- If you used docker compose -f docker-compose-CN.yml up -d to start up the RAGFlow server:
docker pull swr.cn-north-4.myhuaweicloud.com/infiniflow/ragflow:latest
docker compose -f docker-compose-CN.yml up -d