#dominoforever | Product Ideas Portal

 

Welcome to the #dominoforever Product Ideas Forum, the place where you can submit product ideas and enhancement requests. We encourage you to participate by voting on, commenting on, and creating new ideas. All new ideas will be evaluated by the HCL Product Management & Engineering teams, and the next steps will be communicated. While not all submitted ideas will be implemented, community feedback will play a key role in influencing which ideas are implemented and when.

For more information and upcoming events around #dominoforever, please visit our Destination Domino Page.

Decrease the memory consumption of java.exe due to enabling Apache Tika

From R10 Documentation

The Domino® server and Notes® standard client use Apache Tika 1.18 open source conversion filters to extract text for full-text searches of attachments. Tika replaces the KeyView conversion filter used in previous releases.

The implementation of Tika supports the ability to:

  • Search a wide range of formats, including container files such as .zip and .tar files.
  • Search ASCII text files that contain UTF-8 encoding.
  • Customize which attachment types can be full-text indexed and the maximum attachment size allowed for full-text indexing.

Tika runs as a Java™ process when you start the Notes standard client or Domino. The process calls tika-server.jar, which starts the HTTP task and listens for text extraction requests on port 9998, by default. If you upgrade to Notes or Domino 10, full-text indexes that previously used KeyView filters to extract text are rebuilt using the Tika filters.

For the list of file formats supported by Tika 1.18, see the Apache Tika web site.
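
To make the request flow described above concrete, here is a minimal sketch of a client asking a Tika server for the plain text of a file. It assumes the standard Apache Tika server REST endpoint (`PUT /tika`) on the default port 9998 mentioned in the documentation. Whether the Tika process bundled with Notes or Domino accepts requests from other local clients is an assumption, and the function names are hypothetical.

```python
import urllib.request

TIKA_PORT = 9998  # default port named in the documentation quoted above


def build_tika_request(data: bytes, host: str = "localhost",
                       port: int = TIKA_PORT) -> urllib.request.Request:
    """Build (but do not send) a PUT request asking a Tika server for plain text.

    The /tika path is the text-extraction endpoint of the standard Apache Tika
    server; its availability in a Domino-managed Tika process is an assumption.
    """
    return urllib.request.Request(
        url=f"http://{host}:{port}/tika",
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def extract_text(path: str, timeout: float = 30.0) -> str:
    """Send the file at `path` to the Tika server and return the extracted text."""
    with open(path, "rb") as fh:
        request = build_tika_request(fh.read())
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `extract_text("report.pdf")` on a machine where a Tika server is listening would return the text that a full-text indexer could then index. Each such request is handled inside the Java process, which is why its memory use grows with the size and number of attachments being converted.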

 

When we create a full-text index (FTI) in 10.0.1, the java.exe process starts. Because Tika covers a broad range of attachment types, this process consumes more memory.

  • Guest
  • Mar 13 2019
  • Already exists
  • Guest commented
    11 Jan, 2020 12:29pm

    Tika is a third-party application. It is maintained by the Apache Tika project.

    Indexing needs memory, especially when operations run in parallel.

    We had the opposite problem: the Tika process didn't have sufficient memory assigned to handle a large number of big PDF attachments.

     

    Tika is regularly updated in Notes/Domino. For example, Domino 10.0.1 and 11.0 each got a newer Tika version, and there are also new settings to tune memory usage.

     

    This is an ongoing process which doesn't need a separate AHA idea. If you have updated to Domino 10.0.1 FP3 and you are still seeing issues, you should open a support ticket.

    Tika integration is still quite new and work on it is ongoing. We have seen improvements in the latest versions.

    [ Daniel Nashed / http://blog.nashcom.de ]

  • Admin
    Thomas Hampel commented
    18 Apr, 2019 12:30pm

    Shortened title - please confirm if it is correct

  • Guest commented
    14 Mar, 2019 06:26am

    Just to clarify, are you asking to decrease the memory consumption of java.exe caused by enabling Apache Tika?

    Tinus Riyanto - Prisma Global Solusi

  • Guest commented
    13 Mar, 2019 07:57pm

    Love the short title :-)