docx2txt

Recover text from .docx files, with good formatting

docx2txt is a Perl-based command-line utility that converts Microsoft Office .docx documents to equivalent plain-text documents. The latest version supports the following features during text extraction:

  • Character conversions; currency characters are converted to their respective names, such as Euro.

  • Capitalisation of text blocks.

  • Center and right justification of text within a line of 80 columns (configurable).

  • Horizontal rulers, line breaks, paragraph separation, and tabs.

  • Indication of hyperlinked text along with the hyperlink (configurable).

  • Handling of bullet, decimal, letter, and roman lists, with an attempt at indentation.

Installation

Install the latest version of docx2txt as follows:

guix install docx2txt

Or install a particular version:

guix install docx2txt@1.4

You can also install packages in augmented, pure, or containerized environments, either for development or simply to try them out without polluting your user profile. See the guix shell documentation for more information.
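
As a rough sketch of the three environment types mentioned above, the shell function below wraps the corresponding guix shell invocations. The function name try_docx2txt and its mode names are hypothetical and exist only for illustration; the guix shell options --pure and --container are the ones described in the guix shell documentation.

```shell
# Hypothetical helper (not part of Guix or docx2txt): start a temporary
# environment that contains docx2txt, without touching the user profile.
# Assumes the guix command is available on PATH.
try_docx2txt() {
    case "${1:-augmented}" in
        # Augmented: docx2txt is added on top of the current environment.
        augmented) guix shell docx2txt ;;
        # Pure: existing environment variables are unset first.
        pure)      guix shell --pure docx2txt ;;
        # Container: the environment runs in an isolated container.
        container) guix shell --container docx2txt ;;
        *)
            echo "usage: try_docx2txt [augmented|pure|container]" >&2
            return 2
            ;;
    esac
}
```

For example, try_docx2txt container would open a containerized shell; leaving that shell discards the environment, and the user profile stays unchanged.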