Hi everyone,
I hope this is the right place. Yesterday I managed to move Paperless to PostgreSQL 18, and since then the Paperless integration has been showing the following message.
Can someone explain what I can do?
It's best to check the paperless-ngx logs; there should be more information there.
I see one red and one yellow message there:
[2025-11-01 12:45:14,378] [INFO] [ocrmypdf._pipeline] skipping all processing on this page
[2025-11-01 12:45:14,384] [INFO] [ocrmypdf._pipelines.ocr] Postprocessing...
[2025-11-01 12:45:14,574] [ERROR] [ocrmypdf._exec.ghostscript] GPL Ghostscript 10.03.1 (2024-05-02)
Copyright (C) 2024 Artifex Software, Inc. All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 5.
Page 1
warning: ignoring zlib error: incorrect data check
warning: ignoring zlib error: incorrect data check
warning: ignoring zlib error: incorrect data check
warning: ignoring zlib error: incorrect data check
GPL Ghostscript 10.03.1:
Detected SMask which must be in DeviceGray, but we are not converting to DeviceGray, reverting to normal PDF output
Page 2
Page 3
Page 4
Page 5
[2025-11-01 12:45:14,706] [INFO] [ocrmypdf._pipeline] Image optimization ratio: 1.00 savings: 0.1%
[2025-11-01 12:45:14,706] [INFO] [ocrmypdf._pipeline] Total file size ratio: 1.53 savings: 34.5%
[2025-11-01 12:45:14,708] [WARNING] [ocrmypdf._pipelines._common] Output file is a valid PDF, but conversion to PDF/A did not succeed (issue: No PDF/A metadata in XMP)
[2025-11-01 12:45:14,709] [DEBUG] [paperless.parsing.tesseract] Incomplete sidecar file: discarding.
[2025-11-01 12:45:14,803] [INFO] [paperless.parsing.tesseract] pdftotext exited 0
Unfortunately that doesn't tell me anything.
After I triggered the option in the system status, it switched to error, and the log now shows the following:
[2025-11-01 13:06:35,320] [INFO] [paperless.tasks] Saving updated classifier model to /usr/src/paperless/data/classification_model.pickle...
[2025-11-01 13:31:53,322] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #4875, titled ELAC VELA FS 407
[2025-11-01 13:31:53,323] [INFO] [paperless.sanity_checker] Document contains no OCR data
[2025-11-01 13:31:53,323] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #2580, titled Citizen Uhr Seriennummer
[2025-11-01 13:31:53,324] [INFO] [paperless.sanity_checker] Document contains no OCR data
[2025-11-01 13:31:53,324] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #2620, titled ELAC SUB3030 Seriennummer
[2025-11-01 13:31:53,325] [INFO] [paperless.sanity_checker] Document contains no OCR data
[2025-11-01 13:31:53,325] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #5548, titled Antrag Glasfaseranschluss
[2025-11-01 13:31:53,326] [ERROR] [paperless.sanity_checker] Checksum mismatch of archived document. Stored: 281976575fab104a137122e67e0ae4d8, actual: 6f6e157f8004651b412a94a71a83c7a8.
[2025-11-01 13:36:23,141] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #4875, titled ELAC VELA FS 407
[2025-11-01 13:36:23,142] [INFO] [paperless.sanity_checker] Document contains no OCR data
[2025-11-01 13:36:23,142] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #2580, titled Citizen Uhr Seriennummer
[2025-11-01 13:36:23,143] [INFO] [paperless.sanity_checker] Document contains no OCR data
[2025-11-01 13:36:23,143] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #2620, titled ELAC SUB3030 Seriennummer
[2025-11-01 13:36:23,144] [INFO] [paperless.sanity_checker] Document contains no OCR data
[2025-11-01 13:36:23,144] [INFO] [paperless.sanity_checker] Detected following issue(s) with document #5548, titled Antrag Glasfaseranschluss
[2025-11-01 13:36:23,145] [ERROR] [paperless.sanity_checker] Checksum mismatch of archived document. Stored: 281976575fab104a137122e67e0ae4d8, actual: 6f6e157f8004651b412a94a71a83c7a8.
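For the checksum mismatch on document #5548, one way to double-check is to hash the archived PDF yourself and compare it with the "Stored" value from the log. This is only a minimal sketch: it assumes the 32-character hex value is an MD5 digest, and the helper names `md5_of_file` and `matches_stored` are made up for illustration. You would have to look up the path of the archived file in your own media folder.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large PDFs do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_stored(path, stored_checksum):
    """Compare the file's MD5 with a checksum copied from the log (assumption: MD5)."""
    return md5_of_file(path) == stored_checksum.strip().lower()
```

If the recomputed value equals the "actual" value from the log, the file on disk really differs from what the database recorded, which points to the file having been changed or regenerated after the checksum was stored.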
I once read that after a PostgreSQL major version change the index no longer matches. The recommendation was to always stay on the current major version, or to export the data and import it again so the index is rebuilt.
With Docker, that means pinning the PostgreSQL version and NOT entering latest there.
For this reason I settled on MariaDB as the database. So far every version change has gone through without trouble.
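To see quickly which services in a compose file are not pinned, something like the following could help. It is a rough sketch that scans the text for `image:` lines instead of parsing the YAML properly, and the function name `unpinned_images` is just an illustration.

```python
import re

# Matches lines such as "    image: docker.io/library/mariadb:11"
IMAGE_LINE = re.compile(r"^\s*image:\s*(\S+)")


def unpinned_images(compose_text):
    """Return image references that use ':latest' or carry no tag at all."""
    unpinned = []
    for line in compose_text.splitlines():
        match = IMAGE_LINE.match(line)
        if not match:
            continue
        image = match.group(1).strip("\"'")
        # Only look at the part after the last '/', so a registry port
        # such as "localhost:5000/app" is not mistaken for a tag.
        name = image.rsplit("/", 1)[-1]
        tag = name.split(":", 1)[1] if ":" in name else ""
        if tag in ("", "latest"):
            unpinned.append(image)
    return unpinned
```

Run against the text of a docker-compose.yml, it lists every image that would silently jump to a new major version on the next `docker compose pull`.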
That's exactly what I did. At first I only had problems because the database path had apparently moved, and then there was an umlaut problem on top. I was able to solve both thanks to AI and the web. After that it actually ran through without errors, until I saw the message in the integration today.
With MariaDB you wouldn't have these problems.
Here is my docker-compose.yaml:
# docker compose file for running paperless from the Docker Hub.
# This file contains everything paperless needs to run.
# Paperless supports amd64, arm and arm64 hardware.
#
# All compose files of paperless configure paperless in the following way:
#
# - Paperless is (re)started on system boot, if it was running before shutdown.
# - Docker volumes for storing data are managed by Docker.
# - Folders for importing and exporting files are created in the same directory
# as this file and mounted to the correct folders inside the container.
# - Paperless listens on port 8000.
#
# In addition to that, this Docker Compose file adds the following optional
# configurations:
#
# - Instead of SQLite (default), MariaDB is used as the database server.
# - Apache Tika and Gotenberg servers are started with paperless and paperless
# is configured to use these services. These provide support for consuming
# Office documents (Word, Excel, Power Point and their LibreOffice counter-
# parts.
#
# To install and update paperless with this file, do the following:
#
# - Copy this file as 'docker-compose.yml' and the files 'docker-compose.env'
# and '.env' into a folder.
# - Run 'docker compose pull'.
# - Run 'docker compose run --rm webserver createsuperuser' to create a user.
# - Run 'docker compose up -d'.
#
# For more extensive installation and update instructions, refer to the
# documentation.
services:
  broker:
    image: docker.io/library/redis:7
    restart: unless-stopped
    volumes:
      - redisdata:/data
  db:
    image: docker.io/library/mariadb:11
    restart: unless-stopped
    volumes:
      - dbdata:/var/lib/mysql
    environment:
      MARIADB_HOST: paperless
      MARIADB_DATABASE: paperless
      MARIADB_USER: paperless
      MARIADB_PASSWORD: paperless
      MARIADB_ROOT_PASSWORD: paperless
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
      - gotenberg
      - tika
    ports:
      - "8000:8000"
    volumes:
      - data:/usr/src/paperless/data
      - media:/usr/src/paperless/media
      - ./export:/usr/src/paperless/export
      - ./consume:/usr/src/paperless/consume
    env_file: docker-compose.env
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBENGINE: mariadb
      PAPERLESS_DBHOST: db
      PAPERLESS_DBUSER: paperless # only needed if non-default username
      PAPERLESS_DBPASS: paperless # only needed if non-default password
      PAPERLESS_DBPORT: 3306
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
  gotenberg:
    image: docker.io/gotenberg/gotenberg:8.7
    restart: unless-stopped
    # The gotenberg chromium route is used to convert .eml files. We do not
    # want to allow external content like tracking pixels or even javascript.
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"
  tika:
    image: docker.io/apache/tika:latest
    restart: unless-stopped
volumes:
  data:
  media:
  dbdata:
  redisdata:
Here is the docker-compose.env:
# The UID and GID of the user used to run paperless in the container. Set this
# to your UID and GID on the host so that you have write access to the
# consumption directory.
#USERMAP_UID=1000
#USERMAP_GID=1000
# Additional languages to install for text recognition, separated by a
# whitespace. Note that this is
# different from PAPERLESS_OCR_LANGUAGE (default=eng), which defines the
# language used for OCR.
# The container installs English, German, Italian, Spanish and French by
# default.
# See https://packages.debian.org/search?keywords=tesseract-ocr-&searchon=names&suite=buster
# for available languages.
#PAPERLESS_OCR_LANGUAGES=tur ces
###############################################################################
# Paperless-specific settings #
###############################################################################
# All settings defined in the paperless.conf.example can be used here. The
# Docker setup does not use the configuration file.
# A few commonly adjusted settings are provided below.
# This is required if you will be exposing Paperless-ngx on a public domain
# (if doing so please consider security measures such as reverse proxy)
#PAPERLESS_URL=https://paperless.example.com
# Adjust this key if you plan to make paperless available publicly. It should
# be a very long sequence of random characters. You don't need to remember it.
#PAPERLESS_SECRET_KEY=change-me
# Use this variable to set a timezone for the Paperless Docker containers. If not specified, defaults to UTC.
#PAPERLESS_TIME_ZONE=America/Los_Angeles
# The default language to use for OCR. Set this to the language most of your
# documents are written in.
#PAPERLESS_OCR_LANGUAGE=eng
# Set if accessing paperless via a domain subpath e.g. https://domain.com/PATHPREFIX and using a reverse-proxy like traefik or nginx
#PAPERLESS_FORCE_SCRIPT_NAME=/PATHPREFIX
#PAPERLESS_STATIC_URL=/PATHPREFIX/static/ # trailing slash required
Version 12 is out by now. How do you handle the update?
This should be enabled and set to deu.
You don't always need the newest version. With a database in particular that isn't so important at first. Otherwise just adjust the compose file.
The same applies here. Just adjust it to your needs.
I set the OCR language to DEU in the configuration. I can change it to German, though.
I'll take a snapshot of my server and switch MariaDB to 12.
I'll post the result here.
The question was, "How do you handle the update?"
In the compose YAML I put latest after the MariaDB image.
Then:
docker compose down
docker compose pull
docker compose up -d
Did you mean this hint, or did I misunderstand you?
I did understand it.
I also explained it here:
If you need more detail, @MartyBr described it well.
Yes, now I've got it. Thanks to you both!
I have now set MariaDB to "latest" and updated from 11 to 12. My Paperless started without complaint and all data is there.
The index is the same, which is why this works with MariaDB. With PostgreSQL it's a huge effort: either pin the version or do an export and import.
So, the installation with MariaDB went smoothly using the script.
Now I wanted to import the export, and that produces errors.
Maybe you can help me with that as well:
paperless@paperless2:~/paperless-ngx$ docker compose exec webserver document_importer ../export
Found existing user(s), this might indicate a non-empty installation
Checking the manifest
Database import failed
No version information present
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/django/db/backends/utils.py", line 105, in _execute
return self.cursor.execute(sql, params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/backends/mysql/base.py", line 76, in execute
return self.cursor.execute(query, args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 179, in execute
res = self._query(mogrified_query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 331, in _query
self._do_get_result(db)
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 136, in _do_get_result
self._result = result = self._get_result()
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 363, in _get_result
return self._get_db().store_result()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
MySQLdb.IntegrityError: (1062, "Duplicate entry 'Diverse/2025/04/2025-04-13 Rechnung-Quittung & JULIA - DAS MU...' for key 'archive_filename'")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/src/paperless/src/manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/lib/python3.12/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
utility.execute()
File "/usr/local/lib/python3.12/site-packages/django/core/management/__init__.py", line 436, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/lib/python3.12/site-packages/django/core/management/base.py", line 416, in run_from_argv
self.execute(*args, **cmd_options)
File "/usr/local/lib/python3.12/site-packages/django/core/management/base.py", line 460, in execute
output = self.handle(*args, **options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/src/paperless/src/documents/management/commands/document_importer.py", line 246, in handle
self._run_import()
File "/usr/src/paperless/src/documents/management/commands/document_importer.py", line 288, in _run_import
self.load_data_to_database()
File "/usr/src/paperless/src/documents/management/commands/document_importer.py", line 226, in load_data_to_database
raise e
File "/usr/src/paperless/src/documents/management/commands/document_importer.py", line 207, in load_data_to_database
call_command("loaddata", manifest_path)
File "/usr/local/lib/python3.12/site-packages/django/core/management/__init__.py", line 194, in call_command
return command.execute(*args, **defaults)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/core/management/base.py", line 460, in execute
output = self.handle(*args, **options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/core/management/commands/loaddata.py", line 103, in handle
self.loaddata(fixture_labels)
File "/usr/local/lib/python3.12/site-packages/django/core/management/commands/loaddata.py", line 164, in loaddata
self.load_label(fixture_label)
File "/usr/local/lib/python3.12/site-packages/django/core/management/commands/loaddata.py", line 254, in load_label
if self.save_obj(obj):
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/core/management/commands/loaddata.py", line 210, in save_obj
obj.save(using=self.using)
File "/usr/local/lib/python3.12/site-packages/django/core/serializers/base.py", line 265, in save
models.Model.save_base(self.object, using=using, raw=True, **kwargs)
File "/usr/local/lib/python3.12/site-packages/django/db/models/base.py", line 1008, in save_base
updated = self._save_table(
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/models/base.py", line 1169, in _save_table
results = self._do_insert(
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/models/base.py", line 1210, in _do_insert
return manager._insert(
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/models/manager.py", line 87, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/models/query.py", line 1868, in _insert
return query.get_compiler(using=using).execute_sql(returning_fields)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/models/sql/compiler.py", line 1882, in execute_sql
cursor.execute(sql, params)
File "/usr/local/lib/python3.12/site-packages/django/db/backends/utils.py", line 79, in execute
return self._execute_with_wrappers(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/backends/utils.py", line 92, in _execute_with_wrappers
return executor(sql, params, many, context)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/backends/utils.py", line 100, in _execute
with self.db.wrap_database_errors:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/utils.py", line 91, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/usr/local/lib/python3.12/site-packages/django/db/backends/utils.py", line 105, in _execute
return self.cursor.execute(sql, params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/django/db/backends/mysql/base.py", line 76, in execute
return self.cursor.execute(query, args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 179, in execute
res = self._query(mogrified_query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 331, in _query
self._do_get_result(db)
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 136, in _do_get_result
self._result = result = self._get_result()
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/MySQLdb/cursors.py", line 363, in _get_result
return self._get_db().store_result()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.IntegrityError: Problem installing fixture '/usr/src/paperless/export/manifest.json': Could not load documents.Document(pk=5419): (1062, "Duplicate entry 'Diverse/2025/04/2025-04-13 Rechnung-Quittung & JULIA - DAS MU...' for key 'archive_filename'")
paperless@paperless2:~/paperless-ngx$
This is my yml:
services:
  broker:
    image: docker.io/library/redis:8
    restart: unless-stopped
    volumes:
      - /data/paperless/redisdata:/data
  db:
    image: docker.io/library/mariadb:latest
    restart: unless-stopped
    volumes:
      - /data/paperless/dbdata:/var/lib/mysql
    environment:
      MARIADB_HOST: paperless
      MARIADB_DATABASE: paperless
      MARIADB_USER: paperless
      MARIADB_PASSWORD: paperless
      MARIADB_ROOT_PASSWORD: paperless
  webserver:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    restart: unless-stopped
    depends_on:
      - db
      - broker
      - gotenberg
      - tika
    ports:
      - "8000:8000"
    volumes:
      - /data/paperless/data:/usr/src/paperless/data
      - /data/paperless/media:/usr/src/paperless/media
      - /data/paperless/export:/usr/src/paperless/export
      - /data/paperless/consume:/usr/src/paperless/consume
    env_file: docker-compose.env
    environment:
      PAPERLESS_REDIS: redis://broker:6379
      PAPERLESS_DBENGINE: mariadb
      PAPERLESS_DBHOST: db
      PAPERLESS_DBUSER: paperless # only needed if non-default username
      PAPERLESS_DBPASS: paperless # only needed if non-default password
      PAPERLESS_DBPORT: 3306
      PAPERLESS_TIKA_ENABLED: 1
      PAPERLESS_TIKA_GOTENBERG_ENDPOINT: http://gotenberg:3000
      PAPERLESS_TIKA_ENDPOINT: http://tika:9998
  gotenberg:
    image: docker.io/gotenberg/gotenberg:8.24
    restart: unless-stopped
    # The gotenberg chromium route is used to convert .eml files. We do not
    # want to allow external content like tracking pixels or even javascript.
    command:
      - "gotenberg"
      - "--chromium-disable-javascript=true"
      - "--chromium-allow-list=file:///tmp/.*"
  tika:
    image: docker.io/apache/tika:latest
    restart: unless-stopped
volumes:
  redisdata:
Something seems to be wrong with the file.
What exactly are you trying to export?
It happens during the import into Paperless with MariaDB.
I searched for the files in the ZIP and found the following:
There seems to be a problem with distinguishing between lowercase and uppercase letters.
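If case really is the cause, it should be possible to find the affected documents in the export before importing. As far as I know, MariaDB's default collations compare text case-insensitively, so two `archive_filename` values that differ only in upper/lower case would collide on the unique key, which fits the "Duplicate entry ... for key 'archive_filename'" error above. The sketch below assumes manifest.json is a Django fixture (a list of objects with `model`, `pk` and `fields`); the function names are made up, and `casefold()` is only an approximation of the database's comparison rules.

```python
import json
from collections import defaultdict


def load_manifest(path):
    """Load the exported manifest.json (assumed to be a JSON list of fixture objects)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def case_insensitive_collisions(manifest_entries, field="archive_filename"):
    """Group documents whose field values are equal when case is ignored."""
    groups = defaultdict(list)
    for entry in manifest_entries:
        if entry.get("model") != "documents.document":
            continue
        value = entry.get("fields", {}).get(field)
        if value:
            # casefold() approximates a case-insensitive collation.
            groups[value.casefold()].append((entry.get("pk"), value))
    # Keep only the groups with more than one document.
    return [items for items in groups.values() if len(items) > 1]
```

Calling `case_insensitive_collisions(load_manifest("manifest.json"))` would give the pk/filename pairs that need to be renamed in the source installation before exporting again.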