Automated Word Document Translation with the OpenAI API: Preserving Styles and Formatting

Introduction

During a recent meeting, the question of translating Japanese instruction manuals into English came up. The discussion was leaning toward translating them by hand, so I proposed automating the work with Python instead. This article introduces the script I developed for that purpose.

The script reads the original Word file, translates its content from Japanese to English using OpenAI's API, and generates a new Word document while maintaining the original formatting.

Script Overview

  • Word Document Translation: The script reads an input Word document, translates its content, and saves it as a new Word document.

  • Format Preservation: It maintains the original document's font styles, colors, sizes, and alignments.

  • Chunk Processing: To stay within the API's per-request size limit, the script divides long text into appropriately sized chunks and translates them one at a time.

  • Customization Options: Users can specify different OpenAI models and temperature settings via command-line arguments.

  • Progress Tracking: The script utilizes the tqdm library to display real-time translation progress.
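
The chunk-processing idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the script's exact code: `chunk_text`, `translate_in_chunks`, and `fake_translate` are hypothetical names, the limit is counted in characters as a rough stand-in for tokens, and the stub function stands in for the OpenAI call.

```python
from typing import Callable, List


def chunk_text(text: str, limit: int) -> List[str]:
    """Cut text into pieces of at most `limit` characters.

    Assumption: characters are a rough stand-in for tokens, and cuts are
    made just after a Japanese full stop when one is available.
    """
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        cut = window.rfind("。") + 1  # 0 when no full stop was found
        if cut == 0:
            cut = limit  # fall back to a hard cut
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks


def translate_in_chunks(text: str, limit: int, translate: Callable[[str], str]) -> str:
    """Translate each chunk separately and join the results."""
    return " ".join(translate(chunk) for chunk in chunk_text(text, limit))


def fake_translate(chunk: str) -> str:
    """Hypothetical stand-in for the API call; it only tags the chunk."""
    return f"[EN:{chunk}]"


sample = "これは最初の文です。これは二番目の文です。最後の文。"
print(chunk_text(sample, 12))
print(translate_in_chunks(sample, 12, fake_translate))
```

Cutting at sentence endings keeps each request self-contained, which tends to matter for translation quality more than hitting the limit exactly.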

Environment Setup

Obtaining an OpenAI API Key

Create an account on the OpenAI website and acquire an API key. For detailed instructions, refer to the following article:

https://qiita.com/kofumi/items/16a9a501ffc8dd49da50

Conda Environment Setup

Create a file named environment_word_translator.yml with the following content:

# environment_word_translator.yml
name: word-translator
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.12
  - python-dotenv
  - openai
  - python-docx
  - tqdm

Then, create the conda environment using:

conda env create -f environment_word_translator.yml
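
After activating the environment (conda activate word-translator), a quick check can confirm that the packages are importable. The following is a small sketch; `missing_modules` is a hypothetical helper, not part of the script. Note that the import names differ from two of the package names: python-dotenv is imported as dotenv, and python-docx as docx.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the translation script
required = ["dotenv", "openai", "docx", "tqdm"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```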

Script

import os
import argparse
from dotenv import load_dotenv
from typing import List, Tuple, Optional
from docx import Document
from docx.shared import RGBColor, Inches
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.text.paragraph import Paragraph
from openai import OpenAI
from tqdm import tqdm

# Load environment variables from .env file
load_dotenv()

# OpenAI API key configuration
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
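MAX_TOKENS = 4000  # Adjusted for GPT-4's capacity; split_text applies it as a character count

def split_text(text: str, max_tokens: int) -> List[str]:
    """Split text into chunks of at most max_tokens characters.

    Length is measured in characters as a rough stand-in for tokens.
    Japanese has no spaces between words, so chunks are cut at sentence
    endings ("。" or a newline) where possible instead of at whitespace.
    """
    chunks = []
    remaining = text.strip()

    while len(remaining) > max_tokens:
        window = remaining[:max_tokens]
        # Prefer to cut just after the last sentence ending in the window
        cut = max(window.rfind("。"), window.rfind("\n")) + 1
        if cut <= 0:
            cut = max_tokens  # No sentence ending found: hard cut
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()

    if remaining:
        chunks.append(remaining)

    return chunks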

def translate_text(client: OpenAI, text: str, model: str, temperature: float) -> str:
    """Translate text to B2 level academic English using OpenAI's GPT-4 model."""
    chunks = split_text(text, MAX_TOKENS)
    translated_chunks = []

    for chunk in tqdm(chunks, desc="Translating chunks", unit="chunk"):
        messages = [
            {"role": "system", "content": """You are a professional translator specializing in academic writing. Your task is to translate the following text to English at a B2 level (upper intermediate) with an academic tone. Follow these guidelines:

1. Maintain the original meaning and style of the text.
2. Use appropriate academic vocabulary and structures, but avoid overly complex or obscure terms.
3. Ensure diversity in word choice and sentence structures. Avoid repetitive use of certain words or phrases.
4. Do not use overly formal or archaic language. Aim for clarity and readability.
5. Preserve the original tone and intent of the text, whether it's persuasive, informative, or analytical.
6. Be aware of context and field-specific terminology, translating them accurately.
7. Avoid using cliché phrases or expressions that are overused in AI-generated text.
8. If the original text includes colloquialisms or idioms, translate them into appropriate English equivalents that maintain the intended meaning and tone.
9. Pay attention to nuances and connotations in the original text, and strive to convey these in the translation.
10. If you encounter any ambiguities in the original text, translate in a way that preserves that ambiguity rather than making assumptions.

Translate the following text, keeping these guidelines in mind:"""},
            {"role": "user", "content": chunk}
        ]
        
        try:
            response = client.chat.completions.create(
                messages=messages,
                model=model,
                max_tokens=MAX_TOKENS,
                n=1,
                stop=None,
                temperature=temperature,
            )
            translated_chunks.append(response.choices[0].message.content.strip())
        except Exception as e:
            print(f"Translation error: {e}")
            translated_chunks.append(chunk)

    return " ".join(translated_chunks)

def process_paragraph(client: OpenAI, paragraph: Paragraph, model: str, temperature: float) -> Tuple[str, List[Tuple]]:
    """Process a paragraph and its runs, returning translated text and formatting info."""
    translated_text = translate_text(client, paragraph.text, model, temperature)

    # Record each run's formatting and text length; None means "inherit from the style"
    formatting = []
    for run in paragraph.runs:
        formatting.append((
            run.bold, run.italic, run.underline,
            run.font.name, run.font.size,
            run.font.color.rgb,
            len(run.text)
        ))

    return translated_text, formatting

def apply_formatting(paragraph: Paragraph, text: str, formatting: List[Tuple]) -> None:
    """Apply formatting to a paragraph based on the original formatting."""
    paragraph.text = ""
    words = text.split()
    format_index = 0
    chars_in_format = 0  # Characters written so far with the current format
    current_run = paragraph.add_run()

    for word in words:
        if format_index < len(formatting):
            bold, italic, underline, font_name, font_size, color, run_length = formatting[format_index]
            current_run = paragraph.add_run(word + " ")
            current_run.bold = bold
            current_run.italic = italic
            current_run.underline = underline
            # Set only explicit values so the paragraph style supplies the rest
            if font_name:
                current_run.font.name = font_name
            if font_size:
                current_run.font.size = font_size
            if color is not None:
                current_run.font.color.rgb = color

            # Move to the next format once this one has covered the original run's length
            chars_in_format += len(current_run.text)
            if chars_in_format >= run_length:
                format_index += 1
                chars_in_format = 0
        else:
            # Formats exhausted: keep appending to the last formatted run
            current_run.add_text(word + " ")

def get_document_margins(doc: Document) -> Tuple[float, float, float, float]:
    """Get the margins of the document in inches."""
    section = doc.sections[0]
    return (
        section.top_margin.inches,
        section.bottom_margin.inches,
        section.left_margin.inches,
        section.right_margin.inches
    )

def set_document_margins(doc: Document, margins: Tuple[float, float, float, float]) -> None:
    """Set the margins of the document in inches."""
    section = doc.sections[0]
    section.top_margin = Inches(margins[0])
    section.bottom_margin = Inches(margins[1])
    section.left_margin = Inches(margins[2])
    section.right_margin = Inches(margins[3])

def translate_word_document(input_path: str, output_path: str, api_key: str, model: str, temperature: float) -> None:
    """Translate a Word document to B2 level academic English, preserving formatting."""
    client = OpenAI(api_key=api_key)
    
    doc = Document(input_path)
    translated_doc = Document()

    # Get and set margins
    original_margins = get_document_margins(doc)
    set_document_margins(translated_doc, original_margins)

    for paragraph in tqdm(doc.paragraphs, desc="Processing paragraphs", unit="paragraph"):
        translated_text, formatting = process_paragraph(client, paragraph, model, temperature)
        translated_paragraph = translated_doc.add_paragraph()
        apply_formatting(translated_paragraph, translated_text, formatting)
        translated_paragraph.alignment = paragraph.alignment
        translated_paragraph.style = paragraph.style

    translated_doc.save(output_path)
    print(f"Translated Word document saved to {output_path}")
    print(f"Original margins (top, bottom, left, right): {original_margins}")

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Translate Word documents to English.")
    parser.add_argument("input_file", help="Path to the input file to be translated")
    parser.add_argument("--model", default="gpt-4", help="OpenAI model to use (default: gpt-4)")
    parser.add_argument("--temperature", type=float, default=0.5, help="Generation temperature (default: 0.5)")
    args = parser.parse_args()

    api_key = OPENAI_API_KEY
    if not api_key:
        raise ValueError("OPENAI_API_KEY is not set.")

    input_file = args.input_file
    output_dir = os.path.dirname(input_file)
    output_file = os.path.join(output_dir, f"translated_{os.path.basename(input_file)}")

    translate_word_document(input_file, output_file, api_key, args.model, args.temperature)
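
The way run formatting is carried over to the translated words can be pictured without python-docx. This sketch assumes each original run is reduced to a (format label, text length) pair and hands the translated words to the formats in order, moving on once a format has covered as many characters as its original run. `spread_formats` and the labels are hypothetical, and the rule is an illustration of the idea rather than the script's exact code.

```python
from typing import List, Tuple


def spread_formats(words: List[str], runs: List[Tuple[str, int]]) -> List[Tuple[str, str]]:
    """Pair each translated word with the format label of an original run."""
    result = []
    index = 0    # which original run's format is in use
    written = 0  # characters written with that format so far
    for word in words:
        # Once the formats run out, keep using the last one
        label = runs[min(index, len(runs) - 1)][0] if runs else "default"
        result.append((word, label))
        if index < len(runs):
            written += len(word) + 1  # +1 for the trailing space
            if written >= runs[index][1]:
                index += 1
                written = 0
    return result


# Original paragraph: a 4-character bold run followed by a 10-character plain run
runs = [("bold", 4), ("plain", 10)]
for word, label in spread_formats("Warning do not open the cover".split(), runs):
    print(f"{label:>5}: {word}")
```

Because English and Japanese differ in length and word order, any rule of this kind is approximate: the formatting lands near the right place, not on the exact translated phrase.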

.env

Create a .env file with your OpenAI API key:

# .env
OPENAI_API_KEY=sk-your_key
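
To confirm that the key is picked up without printing it in full, a small check can help. This is a sketch with a hypothetical helper, `mask_key`. It reads the variable from the process environment, so when used together with a .env file it would need to run after `load_dotenv()`.

```python
import os


def mask_key(key: str, visible: int = 4) -> str:
    """Hide all but the last few characters of a secret."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


api_key = os.getenv("OPENAI_API_KEY")
if api_key:
    print("OPENAI_API_KEY loaded:", mask_key(api_key))
else:
    print("OPENAI_API_KEY is not set; check the .env file.")
```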

Usage Instructions

Help Command

To view the help menu, use:

python Japanese-word-file-translate-to-English.py --help
usage: Japanese-word-file-translate-to-English.py [-h] [--model MODEL] [--temperature TEMPERATURE] input_file

Translate Word documents to English.

positional arguments:
  input_file            Path to the input file to be translated

options:
  -h, --help            show this help message and exit
  --model MODEL         OpenAI model to use (default: gpt-4)
  --temperature TEMPERATURE
                        Generation temperature (default: 0.5)

Example Usage

# Note: If the file path contains spaces, enclose it in quotation marks.
python Japanese-word-file-translate-to-English.py "Path/to/your/file/sss sss.docx"
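
The translated file is written to the same directory as the input, with a translated_ prefix added to the file name. The naming rule from the script's main block can be isolated as follows (the function name `output_path_for` is hypothetical):

```python
import os


def output_path_for(input_file: str) -> str:
    """Build the output path: same directory, 'translated_' prefix."""
    directory = os.path.dirname(input_file)
    return os.path.join(directory, f"translated_{os.path.basename(input_file)}")


print(output_path_for("Path/to/your/file/sss sss.docx"))
print(output_path_for("manual.docx"))
```

An existing file with the same name in that directory would be overwritten, so it may be worth checking before a second run.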

Conclusion

This Python script produces high-quality Japanese-to-English translations of Word documents in a matter of seconds, even for lengthy texts. In my own use, the cost has been approximately 100 yen per document.

The goal is to allocate more time for educational and research activities by streamlining administrative tasks. I hope this contributes to a more productive academic environment.

The progress of AI technology is truly remarkable. I encourage you to leverage such AI technologies to create more efficient work environments.

Your feedback, questions, and suggestions for improvement are welcome and appreciated. Please feel free to leave a comment below.
