mirror of
https://github.com/TriliumNext/Trilium.git
synced 2025-12-09 19:43:01 -06:00
Open
opened 2025-10-01 16:42:27 -05:00 by giteasync
Reference: TriliumNext/Trilium#5328
📋 Pull Request Information
Original PR: https://github.com/TriliumNext/Trilium/pull/5834
Author: @perfectra1n
Created: 6/21/2025
Status: 🔄 Open
Base: main ← Head: feat/add-ocr-capabilities
📝 Commits (10+)
- c4a0219 feat(ocr): add unit tests, resolve double sent headers, and fix the wonderful tesseract.js path issues
- 33a5492 fix(package): referenced wrong tesseract.js lol
- 864543e feat(ocr): drop confidence down a little bit
- a4adc51 fix(unit): resolve typecheck errors
- f135622 feat(unit): ocr unit tests almost pass
- d20b3d8 feat(unit): ocr tests almost pass...
- 80a9182 feat(unit): ocr tests almost pass...
- 7868ebe fix(unit): also fix broken llm test
- 09196c0 fix(ocr): obviously don't need this migration file anymore
- 4b5e8d3 Update playwright.yml
📊 Changes
40 files changed (+4843 additions, -92 deletions)
📝 .github/instructions/nx.instructions.md (+1 -1)
📝 .github/workflows/playwright.yml (+0 -1)
📝 apps/client/src/components/root_command_executor.ts (+13 -0)
📝 apps/client/src/services/content_renderer.ts (+40 -4)
📝 apps/client/src/stylesheets/style.css (+23 -0)
📝 apps/client/src/translations/en/translation.json (+31 -1)
📝 apps/client/src/widgets/buttons/note_actions.ts (+9 -0)
📝 apps/client/src/widgets/note_detail.ts (+5 -0)
📝 apps/client/src/widgets/type_widgets/options/images/images.ts (+332 -0)
➕ apps/client/src/widgets/type_widgets/read_only_ocr_text.ts (+215 -0)
📝 apps/client/src/widgets/view_widgets/list_or_grid_view.ts (+2 -1)
📝 apps/server/package.json (+6 -1)
📝 apps/server/src/assets/db/schema.sql (+2 -0)
📝 apps/server/src/becca/entities/bblob.ts (+4 -1)
📝 apps/server/src/migrations/migrations.ts (+19 -0)
📝 apps/server/src/routes/api/llm.spec.ts (+3 -36)
➕ apps/server/src/routes/api/ocr.spec.ts (+75 -0)
➕ apps/server/src/routes/api/ocr.ts (+612 -0)
📝 apps/server/src/routes/api/options.ts (+7 -1)
📝 apps/server/src/routes/routes.ts (+11 -0)
...and 20 more files
📄 Description
This PR integrates OCR capabilities by orchestrating interactions between a new client-side UI, a set of server-side API endpoints, a core OCR service, the Tesseract.js library, and the existing database schema.
Key Features:
- `/api/ocr/process-note/{noteId}`: Triggers OCR processing for a specific image note.
- `/api/ocr/process-attachment/{attachmentId}`: Triggers OCR for a specific image attachment.
- `/api/ocr/search`: Searches for text within the extracted OCR data.
- `/api/ocr/batch-process`: Initiates a batch job to process all images that haven't been OCR'd yet.
- `/api/ocr/batch-progress`: Retrieves the progress of the ongoing batch OCR job.
- `/api/ocr/stats`: Provides statistics on OCR'd files.
- `/api/ocr/delete/{blobId}`: Deletes the OCR data for a specific image.
- Extracted text is stored in the `ocr_text` column. This allows for efficient searching of image content.
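
To make the endpoint list above easier to consume from client code, here is a small sketch of typed path builders. Only the paths themselves come from this description; the `ocrEndpoints` object and its member names are hypothetical and are not part of the PR. HTTP methods are deliberately left out, since the description only states them for the batch endpoints.

```typescript
// Hypothetical helper: builds the OCR endpoint paths listed in this PR.
// The object and member names are illustrative assumptions, not PR code.
const ocrEndpoints = {
    processNote: (noteId: string): string =>
        `/api/ocr/process-note/${encodeURIComponent(noteId)}`,
    processAttachment: (attachmentId: string): string =>
        `/api/ocr/process-attachment/${encodeURIComponent(attachmentId)}`,
    search: (): string => "/api/ocr/search",
    batchProcess: (): string => "/api/ocr/batch-process",
    batchProgress: (): string => "/api/ocr/batch-progress",
    stats: (): string => "/api/ocr/stats",
    deleteOcr: (blobId: string): string =>
        `/api/ocr/delete/${encodeURIComponent(blobId)}`
};
```

Identifiers are passed through `encodeURIComponent` so that an unexpected character in an ID cannot alter the path structure.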
Implementation Details:
- … extraction, and database interaction.
- Supported image formats: JPEG, PNG, GIF, BMP, TIFF, and WEBP.
- `apps/client/src/widgets/type_widgets/options/images/images.ts` provides an interface for managing OCR settings and initiating batch processing.

Data Storage and Schema
- A new column, `ocr_text` (of type `TEXT`), has been added to the existing `blobs` table. The `blobs` table stores the actual file content (the image itself), so this new column adds the extracted text alongside the binary data it was derived from.
- The `OCRService.storeOCRResult()` method is responsible for persistence. It executes the SQL command: `UPDATE blobs SET ocr_text = ? WHERE blobId = ?`.
- The `OCRService.getStoredOCRResult()` method checks if text already exists using: `SELECT ocr_text FROM blobs WHERE blobId = ?`.
- The `OCRService.searchOCRResults()` method performs a `LIKE` query to find matches: `SELECT blobId, ocr_text FROM blobs WHERE ocr_text LIKE ?`.

Core Logic: `OCRService` (`apps/server/src/services/ocr/ocr_service.ts`)
This class contains the primary business logic and orchestrates the entire OCR process.
- `initialize`: The service doesn't initialize Tesseract on application startup. Instead, it's initialized on demand the first time an OCR operation is requested. It configures the paths for the Tesseract worker (`worker-script/node/index.js`) and the WebAssembly core (`tesseract-core.wasm.js`).
- `extractTextFromImage`: This is the heart of the process. It takes a `Buffer` of image data, passes it to the `Tesseract.worker.recognize()` function, and awaits the result. It then formats the output into a structured `OCRResult` object, converting Tesseract's confidence score from a 0-100 scale to a 0-1 decimal.
- `processNoteOCR`, `processAttachmentOCR`: These methods act as controllers. They fetch the relevant note or attachment from the database using the `becca` service, verify its MIME type is a supported image format, and check if OCR text already exists in the `blobs` table. If all checks pass, they retrieve the image content via `.getContent()` and pass the resulting buffer to `extractTextFromImage`. Finally, they persist the result using `storeOCRResult`.
- `startBatchProcessing`, `processBatchInBackground`:
  - Batch state is held in `this.batchProcessingState`. This object tracks the total number of images, the number processed, and the start time. Using in-memory state is efficient for tracking the live progress of a single, ongoing task.
  - The batch job (`processBatchInBackground`) runs asynchronously without blocking the main thread. It iterates through the unprocessed images, calls the appropriate processing method (`processNoteOCR` or `processAttachmentOCR`) for each, and increments the `processed` count in the `batchProcessingState`.

Server API (`apps/server/src/routes/api/ocr.ts`)
This file acts as a thin routing layer, exposing the `OCRService`'s functionality via HTTP endpoints.
- Each function (`processNoteOCR`, `batchProcessOCR`, `getBatchProgress`) corresponds to an API endpoint.
- … `noteId`).
- The work is delegated to `ocrService` (e.g., `ocrService.startBatchProcessing()`).
- The `getBatchProgress` endpoint is particularly simple: it just calls `ocrService.getBatchProgress()` and returns the in-memory state object, allowing the client to poll for updates efficiently.

Client-Side UI (`apps/client/src/widgets/type_widgets/options/images/images.ts`)
This widget provides the user interface for interacting with the OCR features.
- `startBatchOcr`: When the user clicks the "Start Batch OCR" button, this function is called. It first makes a `POST` request to the `/api/ocr/batch-process` endpoint to initiate the process on the server.
- `pollBatchOcrProgress`: Upon a successful response from the server, it begins polling. It calls itself recursively using `setTimeout` every second. In each call, it makes a `GET` request to `/api/ocr/batch-progress`.
- When the response reports `inProgress: false`, it stops the polling loop and displays a completion message.

Data Flow (Mermaid Diagram)
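
The flow described in the sections above (the client starts a batch, the server tracks progress in memory, the client polls until `inProgress` is false) can be approximated in code. The following is a minimal in-process sketch, not the PR's implementation: `BatchTracker`, `runBatch`, and `pollUntilDone` are hypothetical names, the HTTP layer is replaced by plain function calls, and the per-image OCR work is a stand-in callback. Only the tracked fields (total, processed, start time, `inProgress`) and the one-second `setTimeout` polling are taken from the description.

```typescript
// Shape of the in-memory batch state (field names are assumptions).
interface BatchProgress {
    inProgress: boolean;
    total: number;
    processed: number;
    startTime?: number;
}

// Hypothetical stand-in for the state the OCR service keeps in memory.
class BatchTracker {
    private state: BatchProgress = { inProgress: false, total: 0, processed: 0 };

    start(total: number): void {
        if (this.state.inProgress) {
            throw new Error("Batch already in progress");
        }
        this.state = { inProgress: true, total, processed: 0, startTime: Date.now() };
    }

    markProcessed(): void {
        this.state.processed += 1;
    }

    finish(): void {
        this.state.inProgress = false;
    }

    // Return a copy so callers cannot mutate the tracker's state.
    getProgress(): BatchProgress {
        return { ...this.state };
    }
}

// Server side: process items sequentially and update the shared state.
// Callers would typically not await this, so the request can return at once.
async function runBatch<T>(
    tracker: BatchTracker,
    items: T[],
    processOne: (item: T) => Promise<void>
): Promise<void> {
    tracker.start(items.length);
    try {
        for (const item of items) {
            await processOne(item);
            tracker.markProcessed();
        }
    } finally {
        // Always clear the flag so pollers stop even if an item throws.
        tracker.finish();
    }
}

// Client side: poll by rescheduling with setTimeout until the batch is done.
function pollUntilDone(
    fetchProgress: () => Promise<BatchProgress>,
    onUpdate: (progress: BatchProgress) => void,
    intervalMs = 1000
): void {
    const tick = async (): Promise<void> => {
        const progress = await fetchProgress();
        onUpdate(progress);
        if (progress.inProgress) {
            setTimeout(tick, intervalMs);
        }
    };
    void tick();
}
```

In this sketch a failing item aborts the batch and clears `inProgress`; whether the actual service skips failed images or stops is not stated in the description.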
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.