Text-related objects
Paragraph objects
-
class docx.text.paragraph.Paragraph
Proxy object wrapping a <w:p> element.
-
add_run(text=None, style=None)
Append a run to this paragraph containing text and having the character style identified by style ID style. text can contain tab (\t) characters, which are converted to the appropriate XML form for a tab. text can also include newline (\n) or carriage return (\r) characters, each of which is converted to a line break.
-
alignment
A member of the WD_PARAGRAPH_ALIGNMENT enumeration specifying the justification setting for this paragraph. A value of None indicates the paragraph has no directly-applied alignment value and will inherit its alignment value from its style hierarchy. Assigning None to this property removes any directly-applied alignment value.
-
clear()
Return this same paragraph after removing all its content. Paragraph-level formatting, such as style, is preserved.
-
insert_paragraph_before(text=None, style=None)
Return a newly created paragraph, inserted directly before this paragraph. If text is supplied, the new paragraph contains that text in a single run. If style is provided, that style is assigned to the new paragraph.
-
paragraph_format
The ParagraphFormat object providing access to the formatting properties for this paragraph, such as line spacing and indentation.
-
style
Read/write. _ParagraphStyle object representing the style assigned to this paragraph. If no explicit style is assigned to this paragraph, its value is the default paragraph style for the document. A paragraph style name can be assigned in lieu of a paragraph style object. Assigning None removes any applied style, making its effective value the default paragraph style for the document.
-
text
String formed by concatenating the text of each run in the paragraph. Tabs and line breaks in the XML are mapped to \t and \n characters respectively.
Assigning text to this property causes all existing paragraph content to be replaced with a single run containing the assigned text. A \t character in the text is mapped to a <w:tab/> element and each \n or \r character is mapped to a line break. Paragraph-level formatting, such as style, is preserved. All run-level formatting, such as bold or italic, is removed.
-
ParagraphFormat objects
-
class docx.text.parfmt.ParagraphFormat
Provides access to paragraph formatting such as justification, indentation, line spacing, space before and after, and widow/orphan control.
-
alignment
A member of the WD_PARAGRAPH_ALIGNMENT enumeration specifying the justification setting for this paragraph. A value of None indicates paragraph alignment is inherited from the style hierarchy.
-
first_line_indent
Length value specifying the relative difference in indentation for the first line of the paragraph. A positive value causes the first line to be indented. A negative value produces a hanging indent. None indicates first line indentation is inherited from the style hierarchy.
-
keep_together
True if the paragraph should be kept “in one piece” and not broken across a page boundary when the document is rendered. None indicates its effective value is inherited from the style hierarchy.
-
keep_with_next
True if the paragraph should be kept on the same page as the subsequent paragraph when the document is rendered. For example, this property could be used to keep a section heading on the same page as its first paragraph. None indicates its effective value is inherited from the style hierarchy.
-
left_indent
Length value specifying the space between the left margin and the left side of the paragraph. None indicates the left indent value is inherited from the style hierarchy. Use an Inches value object as a convenient way to apply indentation in units of inches.
-
line_spacing
float or Length value specifying the space between baselines in successive lines of the paragraph. A value of None indicates line spacing is inherited from the style hierarchy. A float value, e.g. 2.0 or 1.75, indicates spacing is applied in multiples of line heights. A Length value such as Pt(12) indicates spacing is a fixed height. The Pt value class is a convenient way to apply line spacing in units of points. Assigning None resets line spacing to inherit from the style hierarchy.
-
line_spacing_rule
A member of the WD_LINE_SPACING enumeration indicating how the value of line_spacing should be interpreted. Assigning any of the WD_LINE_SPACING members SINGLE, DOUBLE, or ONE_POINT_FIVE will cause the value of line_spacing to be updated to produce the corresponding line spacing.
-
page_break_before
True if the paragraph should appear at the top of the page following the prior paragraph. None indicates its effective value is inherited from the style hierarchy.
-
right_indent
Length value specifying the space between the right margin and the right side of the paragraph. None indicates the right indent value is inherited from the style hierarchy. Use a Cm value object as a convenient way to apply indentation in units of centimeters.
-
space_after
Length value specifying the spacing to appear between this paragraph and the subsequent paragraph. None indicates this value is inherited from the style hierarchy. Length objects provide convenience properties, such as pt and inches, that allow easy conversion to various length units.
-
space_before
Length value specifying the spacing to appear between this paragraph and the prior paragraph. None indicates this value is inherited from the style hierarchy. Length objects provide convenience properties, such as pt and cm, that allow easy conversion to various length units.
-
widow_control
True if the first and last lines in the paragraph remain on the same page as the rest of the paragraph when Word repaginates the document. None indicates its effective value is inherited from the style hierarchy.
-
Run objects
-
class docx.text.run.Run
Proxy object wrapping a <w:r> element. Several of the properties on Run take a tri-state value, True, False, or None. True and False correspond to on and off respectively. None indicates the property is not specified directly on the run and its effective value is taken from the style hierarchy.
-
add_break(break_type=6)
Add a break element of break_type to this run. break_type can take the values WD_BREAK.LINE, WD_BREAK.PAGE, and WD_BREAK.COLUMN, where WD_BREAK is imported from docx.enum.text. break_type defaults to WD_BREAK.LINE.
-
add_picture(image_path_or_stream, width=None, height=None)
Return an InlineShape instance containing the image identified by image_path_or_stream, added to the end of this run. image_path_or_stream can be a path (a string) or a file-like object containing a binary image. If neither width nor height is specified, the picture appears at its native size. If only one is specified, it is used to compute a scaling factor that is then applied to the unspecified dimension, preserving the aspect ratio of the image. The native size of the picture is calculated using the dots-per-inch (dpi) value specified in the image file, defaulting to 72 dpi if no value is specified, as is often the case.
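The sizing rules described above can be worked through in plain Python. This is only a sketch of the arithmetic, not the library's implementation; native_size_emu and scaled_size are hypothetical helper names, and the 300 x 150 pixel image is made up:

```python
EMU_PER_INCH = 914400  # English Metric Units per inch


def native_size_emu(px_width, px_height, dpi=72):
    """Native (width, height) in EMU from pixel size and a dpi value."""
    return (px_width * EMU_PER_INCH // dpi, px_height * EMU_PER_INCH // dpi)


def scaled_size(native, width=None, height=None):
    """Fill in a missing dimension so the aspect ratio is preserved."""
    native_width, native_height = native
    if width is None and height is None:
        return native
    if height is None:
        height = round(native_height * width / native_width)
    elif width is None:
        width = round(native_width * height / native_height)
    return (width, height)


native = native_size_emu(300, 150)         # no dpi in the file, so 72 is used
half = scaled_size(native, width=1905000)  # only width given
print(native, half)
```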
-
add_tab()
Add a <w:tab/> element at the end of the run, which Word interprets as a tab character.
-
add_text(text)
Append a new _Text object (corresponding to a new <w:t> child element) containing text to the run and return it. Compare with the possibly more friendly approach of assigning text to the Run.text property.
-
bold
Read/write. Causes the text of the run to appear in bold.
-
clear()
Return reference to this run after removing all its content. All run formatting is preserved.
-
font
The Font object providing access to the character formatting properties for this run, such as font name and size.
-
italic
Read/write tri-state value. When True, causes the text of the run to appear in italics.
-
style
Read/write. A _CharacterStyle object representing the character style applied to this run. The default character style for the document (often Default Character Font) is returned if the run has no directly-applied character style. Setting this property to None removes any directly-applied character style.
-
text
String formed by concatenating the text equivalent of each run content child element into a Python string. Each <w:t> element adds the text characters it contains. A <w:tab/> element adds a \t character. A <w:cr/> or <w:br> element adds a \n character. Note that a <w:br> element can indicate a page break or column break as well as a line break. All <w:br> elements translate to a single \n character regardless of their type. All other content child elements, such as <w:drawing>, are ignored.
Assigning text to this property has the reverse effect, translating each \t character to a <w:tab/> element and each \n or \r character to a <w:cr/> element. Any existing run content is replaced. Run formatting is preserved.
-
underline
The underline style for this Run, one of None, True, False, or a value from WD_UNDERLINE. A value of None indicates the run has no directly-applied underline value and so will inherit the underline value of its containing paragraph. Assigning None to this property removes any directly-applied underline value. A value of False indicates a directly-applied setting of no underline, overriding any inherited value. A value of True indicates single underline. The values from WD_UNDERLINE are used to specify other underline styles such as double, wavy, and dotted.
-
Font objects
-
class docx.text.run.Font
Proxy object wrapping the parent of a <w:rPr> element and providing access to character properties such as font name, font size, bold, and subscript.
-
all_caps
Read/write. Causes text in this font to appear in capital letters.
-
bold
Read/write. Causes text in this font to appear in bold.
-
color
A ColorFormat object providing a way to get and set the text color for this font.
-
complex_script
Read/write tri-state value. When True, causes the characters in the run to be treated as complex script regardless of their Unicode values.
-
cs_bold
Read/write tri-state value. When True, causes the complex script characters in the run to be displayed in bold typeface.
-
cs_italic
Read/write tri-state value. When True, causes the complex script characters in the run to be displayed in italic typeface.
-
double_strike
Read/write tri-state value. When True, causes the text in the run to appear with double strikethrough.
-
emboss
Read/write tri-state value. When True, causes the text in the run to appear as if raised off the page in relief.
-
hidden
Read/write tri-state value. When True, causes the text in the run to be hidden from display, unless application settings force hidden text to be shown.
-
highlight_color
A member of WD_COLOR_INDEX indicating the color of highlighting applied, or None if no highlighting is applied.
-
imprint
Read/write tri-state value. When True, causes the text in the run to appear as if pressed into the page.
-
italic
Read/write tri-state value. When True, causes the text of the run to appear in italics. None indicates the effective value is inherited from the style hierarchy.
-
math
Read/write tri-state value. When True, specifies this run contains WML that should be handled as though it were Office Open XML Math.
-
name
Get or set the typeface name for this Font instance, causing the text it controls to appear in the named font, if a matching font is found. None indicates the typeface is inherited from the style hierarchy.
-
no_proof
Read/write tri-state value. When True, specifies that the contents of this run should not report any errors when the document is scanned for spelling and grammar.
-
outline
Read/write tri-state value. When True, causes the characters in the run to appear as if they have an outline, by drawing a one-pixel-wide border around the inside and outside borders of each character glyph.
-
rtl
Read/write tri-state value. When True, causes the text in the run to have right-to-left characteristics.
-
shadow
Read/write tri-state value. When True, causes the text in the run to appear as if each character has a shadow.
-
size
Read/write Length value or None, indicating the font height in English Metric Units (EMU). None indicates the font size should be inherited from the style hierarchy. Length is a subclass of int having properties for convenient conversion into points or other length units. The docx.shared.Pt class allows convenient specification of point values:
>>> font.size = Pt(24)
>>> font.size
304800
>>> font.size.pt
24.0
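The 304800 figure follows from there being 914400 EMU per inch and 72 points per inch. The plain-Python sketch below reproduces that arithmetic; pt_to_emu and emu_to_pt are hypothetical helper names used only for illustration, not part of the library:

```python
EMU_PER_INCH = 914400
POINTS_PER_INCH = 72
EMU_PER_POINT = EMU_PER_INCH // POINTS_PER_INCH  # 12700


def pt_to_emu(points):
    """Convert a size in points to whole EMU."""
    return int(points * EMU_PER_POINT)


def emu_to_pt(emu):
    """Convert EMU back to (possibly fractional) points."""
    return emu / EMU_PER_POINT


print(pt_to_emu(24))      # 304800
print(emu_to_pt(304800))  # 24.0
```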
-
small_caps
Read/write tri-state value. When True, causes the lowercase characters in the run to appear as capital letters two points smaller than the font size specified for the run.
-
snap_to_grid
Read/write tri-state value. When True, causes the run to use the document grid characters-per-line settings defined in the docGrid element when laying out the characters in this run.
-
spec_vanish
Read/write tri-state value. When True, specifies that the given run shall always behave as if it is hidden, even when hidden text is being displayed in the current document. The property has a very narrow, specialized use related to the table of contents. Consult the spec (§17.3.2.36) for more details.
-
strike
Read/write tri-state value. When True, causes the text in the run to appear with a single horizontal line through the center of the line.
-
subscript
Boolean indicating whether the characters in this Font appear as subscript. None indicates the subscript/superscript value is inherited from the style hierarchy.
-
superscript
Boolean indicating whether the characters in this Font appear as superscript. None indicates the subscript/superscript value is inherited from the style hierarchy.
-
underline
The underline style for this Font, one of None, True, False, or a value from WD_UNDERLINE. None indicates the font inherits its underline value from the style hierarchy. False indicates no underline. True indicates single underline. The values from WD_UNDERLINE are used to specify other underline styles such as double, wavy, and dotted.
-
web_hidden
Read/write tri-state value. When True, specifies that the contents of this run shall be hidden when the document is displayed in web page view.
-
TabStop objects
-
class docx.text.tabstops.TabStop
An individual tab stop applying to a paragraph or style. Accessed using list semantics on its containing TabStops object.
-
alignment
A member of WD_TAB_ALIGNMENT specifying the alignment setting for this tab stop. Read/write.
-
leader
A member of WD_TAB_LEADER specifying a repeating character used as a “leader”, filling in the space spanned by this tab. Assigning None produces the same result as assigning WD_TAB_LEADER.SPACES. Read/write.
-
TabStops objects
-
class docx.text.tabstops.TabStops
A sequence of TabStop objects providing access to the tab stops of a paragraph or paragraph style. Supports iteration, indexed access, del, and len(). It is accessed using the tab_stops property of ParagraphFormat; it is not intended to be constructed directly.
-
add_tab_stop(position, alignment=WD_TAB_ALIGNMENT.LEFT, leader=WD_TAB_LEADER.SPACES)
Add a new tab stop at position, a Length object specifying the location of the tab stop relative to the paragraph edge. A negative position value is valid and appears in hanging indentation. Tab alignment defaults to left, but may be specified by passing a member of the WD_TAB_ALIGNMENT enumeration as alignment. An optional leader character can be specified by passing a member of the WD_TAB_LEADER enumeration as leader.
-