Class | Text::Format |
In: |
lib/text/format.rb
|
Parent: | Object |
Text::Format provides the ability to nicely format fixed-width text with knowledge of the writeable space (number of columns), margins, and indentation settings.
Copyright: | Copyright © 2002 - 2005 by Austin Ziegler |
Version: | 1.0.0 |
Based On: | Perl Text::Format, Copyright © 1998 Gábor Egressy |
Licence: | Ruby’s, Perl Artistic, or GPL version 2 (or later) |
VERSION | = | '1.0.0' | ||
SPACES_RE | = | %r{\s+}mo.freeze | ||
NEWLINE_RE | = | %r{\n}o.freeze | ||
TAB | = | "\t".freeze | ||
NEWLINE | = | "\n".freeze | ||
ABBREV | = | %w(Mr Mrs Ms Jr Sr Dr) | Global common English abbreviations. More can be added with abbreviations. | |
LEFT_ALIGN | = | :left |
Formats text flush to the left margin with a visual and physical ragged
right margin.
    >A paragraph that is<
    >left aligned.<
|
RIGHT_ALIGN | = | :right |
Formats text flush to the right margin with a visual ragged left margin.
The actual left margin is padded with spaces from the beginning of the line
to the start of the text such that the right margin will be flush.
    >A paragraph that is<
    >     right aligned.<
|
RIGHT_FILL | = | :fill |
Formats text flush to the left margin with a visual ragged right margin.
The line is padded with spaces from the end of the text to the right
margin.
    >A paragraph that is<
    >right filled.      <
|
JUSTIFY | = | :justify |
Formats the text flush to both the left and right margins. The last line
will not be justified if it consists of a single word (it will be treated
as RIGHT_FILL in this case). Spacing between words is increased to
ensure that the text is flush with both margins.

    |A paragraph that|
    |is    justified.|

    |A paragraph that is|
    |justified.         |
|
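The four styles can be approximated on a single line with a short sketch. The helpers `align_line` and `justify_line` below are hypothetical and are not part of Text::Format; in particular, giving the leftover spaces to the leftmost gaps is an assumption, since the text above does not say how the extra spacing is distributed.

```ruby
# Hypothetical helper, not part of Text::Format: aligns one line of text
# inside a field of `width` characters using the four style symbols.
def align_line(line, width, style)
  case style
  when :left    then line                      # ragged right, no padding
  when :right   then line.rjust(width)         # pad on the left
  when :fill    then line.ljust(width)         # pad on the right
  when :justify then justify_line(line, width) # widen the gaps between words
  else raise ArgumentError, "unknown style: #{style.inspect}"
  end
end

# Spreads the missing spaces over the gaps between words (assumption: the
# leftmost gaps receive any remainder). A single word is right-filled.
def justify_line(line, width)
  words = line.split
  return line.ljust(width) if words.size < 2 || line.size >= width

  gaps = words.size - 1
  spaces = width - words.join.size
  base, extra = spaces.divmod(gaps)
  words.each_with_index.map { |word, i|
    i == gaps ? word : word + ' ' * (base + (i < extra ? 1 : 0))
  }.join
end

puts ">#{align_line('right aligned.', 19, :right)}<"
puts "|#{align_line('is justified.', 16, :justify)}|"
```

The single-word case falls back to `ljust`, matching the rule that a one-word last line is treated as RIGHT_FILL.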
SPLIT_FIXED | = | 1 |
When hard_margins is enabled, a word that extends over the right margin
will be split at the number of characters needed. This is similar to how
characters wrap on a terminal. This is the default split mechanism when
hard_margins is enabled.
    repre
    senta
    tion
|
SPLIT_CONTINUATION | = | 2 |
When hard_margins is enabled, a word that extends over the right margin
will be split at one less than the number of characters needed with a
C-style continuation character (\). If the word cannot be split using the
rules of SPLIT_CONTINUATION, and the word will not fit wholly into the next
line, then SPLIT_FIXED will be used.
    repr\
    esen\
    tati\
    on
|
SPLIT_HYPHENATION | = | 4 |
When hard_margins is enabled, a word that extends over the right margin
will be split according to the hyphenator specified by the hyphenator
object; if there is no hyphenation library supplied, then the hyphenator of
Text::Format itself is used, which is the same as
SPLIT_CONTINUATION. See hyphenator for more information about hyphenation
libraries. The example below is valid with either TeX::Hyphen or
Text::Hyphen. If the word cannot be split using the hyphenator’s
rules, and the word will not fit wholly into the next line, then
SPLIT_FIXED will be used.
    rep-
    re-
    sen-
    ta-
    tion
|
SPLIT_CONTINUATION_FIXED | = | SPLIT_CONTINUATION | SPLIT_FIXED | When hard_margins is enabled, a word that extends over the right margin will be split at one less than the number of characters needed with a C-style continuation character (\). If the word cannot be split using the rules of SPLIT_CONTINUATION, then SPLIT_FIXED will be used. | |
SPLIT_HYPHENATION_FIXED | = | SPLIT_HYPHENATION | SPLIT_FIXED | When hard_margins is enabled, a word that extends over the right margin will be split according to the hyphenator specified by the hyphenator object; if there is no hyphenation library supplied, then the hyphenator of Text::Format itself is used, which is the same as SPLIT_CONTINUATION. See hyphenator for more information about hyphenation libraries. The example below is valid with either TeX::Hyphen or Text::Hyphen. If the word cannot be split using the hyphenator’s rules, then SPLIT_FIXED will be used. | |
SPLIT_HYPHENATION_CONTINUATION | = | SPLIT_HYPHENATION | SPLIT_CONTINUATION | Attempts to split words according to the rules of the supplied hyphenator (e.g., SPLIT_HYPHENATION); if the word cannot be split using these rules, then the rules of SPLIT_CONTINUATION will be followed. In all cases, if the word cannot be split using either SPLIT_HYPHENATION or SPLIT_CONTINUATION, and the word will not fit wholly into the next line, then SPLIT_FIXED will be used. | |
SPLIT_ALL | = | SPLIT_HYPHENATION | SPLIT_CONTINUATION | SPLIT_FIXED | Attempts to split words according to the rules of the supplied hyphenator (e.g., SPLIT_HYPHENATION); if the word cannot be split using these rules, then the rules of SPLIT_CONTINUATION will be followed. In all cases, if the word cannot be split using either SPLIT_HYPHENATION or SPLIT_CONTINUATION, then SPLIT_FIXED will be used. | |
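Because the three basic modes are the powers of two 1, 2, and 4, the combined constants are ordinary bit masks. The sketch below redefines the constants locally so it runs on its own; `split_modes` is a hypothetical helper used only to show how a rule value can be tested with a bitwise AND.

```ruby
# Values copied from the constant table above.
SPLIT_FIXED        = 1
SPLIT_CONTINUATION = 2
SPLIT_HYPHENATION  = 4
SPLIT_ALL = SPLIT_HYPHENATION | SPLIT_CONTINUATION | SPLIT_FIXED

# Hypothetical helper: lists the basic modes contained in a rule value.
def split_modes(rules)
  { hyphenation: SPLIT_HYPHENATION,
    continuation: SPLIT_CONTINUATION,
    fixed: SPLIT_FIXED }.select { |_, bit| (rules & bit) != 0 }.keys
end

p split_modes(SPLIT_ALL)
p split_modes(SPLIT_CONTINUATION | SPLIT_FIXED)
```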
TERMINAL_PUNCTUATION | = | %q(.?!) | Indicates punctuation characters that terminate a sentence, as some English typesetting rules indicate that sentences should be followed by two spaces. This is an archaic rule, but is supported with extra_space. This is the default set of terminal punctuation characters. Additional terminal punctuation may be added to the formatting object through terminal_punctuation. | |
TERMINAL_QUOTES | = | %q('") | Indicates quote characters that may follow terminal punctuation under the current formatting rules. This satisfies the English formatting rule that indicates that sentences terminated inside of quotes should have the punctuation inside of the quoted text, not outside of the terminal quote. Additional terminal quotes may be added to the formatting object through terminal_quotes. See TERMINAL_PUNCTUATION for more information. | |
RE_BREAK_SIZE | = | lambda { |size| %r[((?:\S+\s+){#{size}})(.+)] } | Returns a regular expression for a set of characters (at least one non-whitespace followed by at least one space) of the specified size followed by one or more of any character. |
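A small sketch of what the generated expression captures. The lambda is copied from the table above; `break_after` is a hypothetical helper that returns the two captured groups, or nil when the text has too few words.

```ruby
# Lambda copied from the constant table above.
RE_BREAK_SIZE = lambda { |size| %r[((?:\S+\s+){#{size}})(.+)] }

# Hypothetical helper: splits text after the first `count` words.
def break_after(text, count)
  match = RE_BREAK_SIZE[count].match(text)
  match ? [match[1], match[2]] : nil
end

p break_after("one two three four", 2)
p break_after("one", 1)
```

The first group keeps the whitespace that follows each word, so the leading part ends with a space.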
abbreviations | [RW] |
Defines the current abbreviations as an array. This is only used if
extra_space is turned on.
If one is abbreviating "President" as "Pres." (abbreviations = ["Pres"]), then the results of formatting will be as illustrated in the table below:

                      abbreviations
    extra_space | #include?("Pres") | not #include?("Pres")
    ------------+-------------------+----------------------
       true     | Pres. Lincoln     | Pres.  Lincoln
       false    | Pres. Lincoln     | Pres. Lincoln
    ------------+-------------------+----------------------
    extra_space | #include?("Mrs")  | not #include?("Mrs")
       true     | Mrs. Lincoln      | Mrs. Lincoln
       false    | Mrs. Lincoln      | Mrs. Lincoln

Note that abbreviations should not have the terminal period as part of their definitions.

This automatic abbreviation handling will cause some issues with uncommon sentence structures. The two sentences below will not be formatted correctly:

    You're in trouble now, Mr.
    Just wait until your father gets home.

Under no circumstances (because Mr is a predefined abbreviation) will this ever be separated by two spaces.
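The rule behind the table can be sketched in a few lines. `join_words` is a hypothetical helper, not a Text::Format method, and it only checks the default terminal punctuation; the ABBREV list is copied from the constants above.

```ruby
ABBREV = %w(Mr Mrs Ms Jr Sr Dr) # predefined list from the constants above

# Hypothetical helper: joins two words with one or two spaces, following
# the table: two spaces only when extra_space is on, the first word ends a
# sentence, and the word is not a known abbreviation.
def join_words(first, second, extra_space, abbreviations = [])
  stem = first.sub(/\.\z/, '') # abbreviations are stored without the period
  known = ABBREV.include?(stem) || abbreviations.include?(stem)
  sentence_end = first.end_with?('.', '?', '!')
  gap = (extra_space && sentence_end && !known) ? '  ' : ' '
  "#{first}#{gap}#{second}"
end

p join_words('Pres.', 'Lincoln', true, ['Pres'])
p join_words('Pres.', 'Lincoln', true)
p join_words('Mrs.', 'Lincoln', true)
```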
|
||||
body_indent | [RW] |
The number of spaces to indent all lines after the first line of a
paragraph. The value provided is silently converted to a positive integer
value.
                                columns
    <-------------------------------------------------------------->
    <-----------><------><---------------------------><------------>
     left margin  INDENT  text is formatted into here  right margin
|
||||
columns | [RW] |
The total width of the format area. The margins, indentation, and text are
formatted into this space. Any value provided is silently converted to a
positive integer.
                                COLUMNS
    <-------------------------------------------------------------->
    <-----------><------><---------------------------><------------>
     left margin  indent  text is formatted into here  right margin
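The space left for text follows from simple subtraction, as the width calculation in the format_one_paragraph listing shows. A minimal sketch, where `text_width` is a hypothetical helper and the sample values are the constructor defaults:

```ruby
# Hypothetical helper mirroring the width arithmetic used when formatting:
# the text area is what remains after the indent and both margins.
def text_width(columns:, indent:, left_margin: 0, right_margin: 0)
  columns - indent - left_margin - right_margin
end

# Defaults: 72 columns, first_indent of 4, body_indent of 0, no margins.
p text_width(columns: 72, indent: 4) # first line of a paragraph
p text_width(columns: 72, indent: 0) # remaining lines
```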
|
||||
extra_space | [RW] |
Indicates whether sentence terminators should be followed by a single space
(false), or two spaces (true). See abbreviations for more
information.
|
||||
first_indent | [RW] |
The number of spaces to indent the first line of a paragraph. The value
provided is silently converted to a positive integer value.
                                columns
    <-------------------------------------------------------------->
    <-----------><------><---------------------------><------------>
     left margin  INDENT  text is formatted into here  right margin
|
||||
format_style | [RW] |
Specifies the format style. Allowable values are:
* LEFT_ALIGN
* RIGHT_ALIGN
* RIGHT_FILL
* JUSTIFY
|
||||
hard_margins | [RW] |
Normally, words larger than the format area will be placed on a line by
themselves. Setting this value to true will force words larger
than the format area to be split into one or more "words" each at
most the size of the format area. The first line and the original word will
be placed into split_words. Note that this will cause the output to look
similar to a format_style of JUSTIFY. (Lines will be filled as much
as possible.)
|
||||
hyphenator | [RW] |
The object responsible for hyphenating. It must respond to hyphenate_to(word, size) or hyphenate_to(word, size, formatter) and
return an array of the word split into two parts (e.g., [part1,
part2]); if there is a hyphenation mark to be applied, responsibility
for adding it belongs to the hyphenator object. The size is the MAXIMUM size
permitted, including any hyphenation marks.
If the hyphenate_to method has an arity of 3, the current formatter (self) will be provided to the method. This allows the hyphenator to make decisions about the hyphenation based on the formatting rules. hyphenate_to should return [nil, word] if the word cannot be hyphenated.
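As an illustration of the contract described above, here is a sketch of a two-argument hyphenator. The class `ExistingHyphenSplitter` is hypothetical and is not shipped with Text::Format; it only breaks a word at a hyphen the word already contains.

```ruby
# Hypothetical hyphenator: breaks only at hyphens already in the word.
class ExistingHyphenSplitter
  # Returns [first_part, rest], or [nil, word] when no break fits in `size`.
  # `size` is the maximum length of first_part, hyphen included.
  def hyphenate_to(word, size)
    # Last hyphen whose position still lets the first part fit in `size`.
    cut = word.rindex('-', size - 1) if size > 0
    return [nil, word] if cut.nil? || cut.zero? || cut == word.size - 1

    [word[0..cut], word[(cut + 1)..-1]]
  end
end

splitter = ExistingHyphenSplitter.new
p splitter.hyphenate_to('fixed-width', 8)
p splitter.hyphenate_to('fixed-width', 5)
```

Since this `hyphenate_to` has an arity of 2, it would be called without the formatter; a three-argument version would receive the formatter as well.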
|
||||
left_margin | [RW] |
The number of spaces used for the left margin. The value provided is
silently converted to a positive integer value.
                                columns
    <-------------------------------------------------------------->
    <-----------><------><---------------------------><------------>
     LEFT MARGIN  indent  text is formatted into here  right margin
|
||||
nobreak | [RW] |
Indicates whether or not the non-breaking space feature should be used.
|
||||
nobreak_regex | [RW] |
A hash which holds the regular expressions on which spaces should not be
broken. The hash is set up such that the key is the first word and the
value is the second word.
For example, if nobreak_regex contains the following hash:

    { %r{Mrs?\.?} => %r{\S+},
      %r{\S+}     => %r{(?:[SJ])r\.?} }

Then "Mr. Jones", "Mrs Jones", and "Jones Jr." would not be broken.

If this simple matching algorithm indicates that there should not be a break at the current end of line, then a backtrack is done until there are two words on which line breaking is permitted. If two such words are not found, then the end of the line will be broken regardless. If there is a single word on the current line, then no backtrack is done and the word is stuck on the end.
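The pair test itself is small enough to sketch on its own. `keep_together?` is a hypothetical helper that applies the example hash to two adjacent words; the backtracking described above is not reproduced here.

```ruby
# The example hash from the text above: key matches the first word,
# value matches the second word.
NOBREAK = {
  %r{Mrs?\.?} => %r{\S+},
  %r{\S+}     => %r{(?:[SJ])r\.?}
}

# Hypothetical helper: true when some pair matches (first word, second word),
# meaning the space between the two words should not become a line break.
def keep_together?(first_word, second_word, rules = NOBREAK)
  rules.any? { |left, right| first_word =~ left && second_word =~ right }
end

p keep_together?('Mr.', 'Jones')
p keep_together?('Jones', 'Jr.')
p keep_together?('the', 'quick')
```

Note that the example expressions are unanchored, so they match anywhere inside a word.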
|
||||
right_margin | [RW] |
The number of spaces used for the right margin. The value provided is
silently converted to a positive integer value.
                                columns
    <-------------------------------------------------------------->
    <-----------><------><---------------------------><------------>
     left margin  indent  text is formatted into here  RIGHT MARGIN
|
||||
split_rules | [RW] |
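Specifies the split mode; used only when hard_margins is set to
true. Allowable values are the split constants described above:
SPLIT_FIXED, SPLIT_CONTINUATION, SPLIT_HYPHENATION,
SPLIT_CONTINUATION_FIXED, SPLIT_HYPHENATION_FIXED,
SPLIT_HYPHENATION_CONTINUATION, and SPLIT_ALL.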
|
||||
split_words | [R] |
An array of words split during formatting if hard_margins is set to
true.
#split_words << Text::Format::SplitWord.new(word, first, rest) |
||||
tabstop | [RW] |
Indicates the number of spaces that a single tab represents. Any value
provided is silently converted to a positive integer.
|
||||
tag_paragraph | [RW] |
Indicates whether the formatting of paragraphs should be done with tagged
paragraphs. Useful only with tag_text.
|
||||
tag_text | [RW] |
The text to be placed before each paragraph when tag_paragraph is
true. When format is called,
only the first element (tag_text[0]) is used. When paragraphs is called, then each successive
element (tag_text[n]) will be used once, with corresponding paragraphs. If
the tag elements are exhausted before the text is exhausted, then the
remaining paragraphs will not be tagged. Regardless of indentation
settings, a blank line will be inserted between all paragraphs when
tag_paragraph is true.
The Text::Format package provides three number generators, Text::Format::Alpha, Text::Format::Number, and Text::Format::Roman to assist with the numbering of paragraphs.
|
||||
terminal_punctuation | [RW] |
Specifies additional punctuation characters that terminate a sentence, as
some English typesetting rules indicate that sentences should be followed
by two spaces. This is an archaic rule, but is supported with extra_space.
This is added to the default set of terminal punctuation defined in
TERMINAL_PUNCTUATION.
|
||||
terminal_quotes | [RW] |
Specifies additional quote characters that may follow terminal punctuation
under the current formatting rules. This satisfies the English formatting
rule that indicates that sentences terminated inside of quotes should have
the punctuation inside of the quoted text, not outside of the terminal
quote. This is added to the default set of terminal quotes defined in
TERMINAL_QUOTES.
|
||||
text | [RW] |
The default text to be manipulated. Note that value is optional, but if the
formatting functions are called without values, this text is what will be
formatted.
|
Creates a Text::Format object. Accepts an optional hash of construction options (this will be changed to named parameters in Ruby 2.0). After the initial object is constructed (with either the provided or default values), the object will be yielded (as self) to an optional block for further construction and operation.
# File lib/text/format.rb, line 1016
def initialize(options = {}) #:yields self:
  @text = options[:text] || []
  @columns = options[:columns] || 72
  @tabstop = options[:tabstop] || 8
  @first_indent = options[:first_indent] || 4
  @body_indent = options[:body_indent] || 0
  @format_style = options[:format_style] || LEFT_ALIGN
  @left_margin = options[:left_margin] || 0
  @right_margin = options[:right_margin] || 0
  @extra_space = options[:extra_space] || false
  @tag_paragraph = options[:tag_paragraph] || false
  @tag_text = options[:tag_text] || []
  @abbreviations = options[:abbreviations] || []
  @terminal_punctuation = options[:terminal_punctuation] || ""
  @terminal_quotes = options[:terminal_quotes] || ""
  @nobreak = options[:nobreak] || false
  @nobreak_regex = options[:nobreak_regex] || {}
  @hard_margins = options[:hard_margins] || false
  @split_rules = options[:split_rules] || SPLIT_FIXED
  @hyphenator = options[:hyphenator] || self

  @hyphenator_arity = @hyphenator.method(:hyphenate_to).arity
  @tag_cur = ""
  @split_words = []

  yield self if block_given?
end
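The constructor combines two idioms: an options hash with `||` defaults, and yielding the new object to an optional block. The sketch below shows the same pattern on a hypothetical stand-in class, `Settings`, so that it runs without the library; only two of the options are modelled, with the defaults taken from the listing.

```ruby
# Hypothetical stand-in class, not Text::Format itself.
class Settings
  attr_accessor :columns, :first_indent

  def initialize(options = {})
    # `||` falls back to the default when the key is missing (or nil/false).
    @columns      = options[:columns] || 72
    @first_indent = options[:first_indent] || 4
    # Hand the half-built object to the caller for further setup.
    yield self if block_given?
  end
end

wide = Settings.new(columns: 100) { |s| s.first_indent = 0 }
p [wide.columns, wide.first_indent]
```

Values passed in the hash are applied first; anything set inside the block runs afterwards and therefore wins.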
Compares the formatting rules, excepting hyphenator, of two Text::Format objects. Generated results (e.g., split_words) are not compared.
# File lib/text/format.rb, line 188
def ==(o)
  (@text == o.text) and
  (@columns == o.columns) and
  (@left_margin == o.left_margin) and
  (@right_margin == o.right_margin) and
  (@hard_margins == o.hard_margins) and
  (@split_rules == o.split_rules) and
  (@first_indent == o.first_indent) and
  (@body_indent == o.body_indent) and
  (@tag_text == o.tag_text) and
  (@tabstop == o.tabstop) and
  (@format_style == o.format_style) and
  (@extra_space == o.extra_space) and
  (@tag_paragraph == o.tag_paragraph) and
  (@nobreak == o.nobreak) and
  (@terminal_punctuation == o.terminal_punctuation) and
  (@terminal_quotes == o.terminal_quotes) and
  (@abbreviations == o.abbreviations) and
  (@nobreak_regex == o.nobreak_regex)
end
Centers the text, preserving empty lines and tabs.
# File lib/text/format.rb, line 728
def center(to_center = nil)
  to_center = @text if to_center.nil?
  to_center = [to_center].flatten

  tabs = 0
  width = @columns - @left_margin - @right_margin
  centered = []
  to_center.each do |tc|
    s = tc.strip
    tabs = s.count(TAB)
    tabs = 0 if tabs.nil?
    ct = ((width - s.size - (tabs * @tabstop) + tabs) / 2)
    ct = (width - @left_margin - @right_margin) - ct
    centered << "#{s.rjust(ct)}\n"
  end
  centered.join('')
end
Replaces all tab characters in the text with tabstop spaces.
# File lib/text/format.rb, line 747
def expand(to_expand = nil)
  to_expand = @text if to_expand.nil?

  tmp = ' ' * @tabstop
  changer = lambda do |text|
    res = text.split(NEWLINE_RE)
    res.collect! { |ln| ln.gsub!(/\t/o, tmp) }
    res.join(NEWLINE)
  end

  if to_expand.kind_of?(Array)
    to_expand.collect { |te| changer[te] }
  else
    changer[to_expand]
  end
end
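A free-standing sketch of the same idea, written with the non-destructive `String#gsub`. The helper name `expand_tabs` is hypothetical. One detail worth knowing when reading the listing: `String#gsub!` returns nil when it makes no substitution, so a line without tabs appears to come back from `collect!` as nil and to be joined as an empty string; the sketch avoids that by using `gsub`.

```ruby
# Hypothetical free-standing version of tab expansion: every tab becomes
# `tabstop` spaces, line by line.
def expand_tabs(text, tabstop = 8)
  filler = ' ' * tabstop
  text.split("\n").map { |line| line.gsub("\t", filler) }.join("\n")
end

p expand_tabs("a\tb\nplain", 4)

# gsub! signals "nothing replaced" with nil rather than returning the string.
p "no tabs here".dup.gsub!(/\t/, '    ')
```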
Formats text into a nice paragraph format. The text is separated into words and then reassembled a word at a time using the settings of this Format object.
If the text argument is nil, then the value of the text attribute will be worked on.
# File lib/text/format.rb, line 550
def format_one_paragraph(text = nil)
  text ||= @text
  text = text[0] if text.kind_of?(Array)

  # Convert the provided paragraph to a list of words.
  words = text.split(SPACES_RE).reverse.reject { |ww| ww.nil? or ww.empty? }

  text = []

  # Find the maximum line width and the initial indent string.
  # TODO 20050114 - allow the left and right margins to be specified as
  # strings. If they are strings, then we need to use the sizes of the
  # strings. Also: allow the indent string to be set manually and
  # indicate whether the indent string will have a following space.
  max_line_width = @columns - @first_indent - @left_margin - @right_margin
  indent_str = ' ' * @first_indent

  first_line = true

  if words.empty?
    line = []
    line_size = 0
    extra_space = false
  else
    line = [ words.pop ]
    line_size = line[-1].size
    extra_space = __add_extra_space?(line[-1])
  end

  while next_word = words.pop
    next_word.strip! unless next_word.nil?
    new_line_size = (next_word.size + line_size) + 1

    if extra_space
      if (line[-1] !~ __sentence_end_re)
        extra_space = false
      end
    end

    # Increase the width of the new line if there's a sentence
    # terminator and we are applying extra_space.
    new_line_size += 1 if extra_space

    # Will the word fit onto the current line? If so, simply append it
    # to the end of the line.

    if new_line_size <= max_line_width
      if line.empty?
        line << next_word
      else
        if extra_space
          line << "  #{next_word}"
        else
          line << " #{next_word}"
        end
      end
    else
      # Forcibly wrap the line if nonbreaking spaces are turned on and
      # there is a condition where words must be wrapped. If we have
      # returned more than one word, readjust the word list.
      line, next_word = __wrap_line(line, next_word) if @nobreak
      if next_word.kind_of?(Array)
        if next_word.size > 1
          words.push(*(next_word.reverse))
          next_word = words.pop
        else
          next_word = next_word[0]
        end
        next_word.strip! unless next_word.nil?
      end

      # Check to see if the line needs to be hyphenated. If a word has a
      # hyphen in it (e.g., "fixed-width"), then we can ALWAYS wrap at
      # that hyphenation, even if #hard_margins is not turned on. More
      # elaborate forms of hyphenation will only be performed if
      # #hard_margins is turned on. If we have returned more than one
      # word, readjust the word list.
      line, new_line_size, next_word = __hyphenate(line, line_size, next_word, max_line_width)
      if next_word.kind_of?(Array)
        if next_word.size > 1
          words.push(*(next_word.reverse))
          next_word = words.pop
        else
          next_word = next_word[0]
        end
        next_word.strip! unless next_word.nil?
      end

      text << __make_line(line, indent_str, max_line_width, next_word.nil?) unless line.nil?

      if first_line
        first_line = false
        max_line_width = @columns - @body_indent - @left_margin - @right_margin
        indent_str = ' ' * @body_indent
      end

      if next_word.nil?
        line = []
        new_line_size = 0
      else
        line = [ next_word ]
        new_line_size = next_word.size
      end
    end

    line_size = new_line_size
    extra_space = __add_extra_space?(next_word) unless next_word.nil?
  end

  loop do
    break if line.nil? or line.empty?
    line, line_size, ww = __hyphenate(line, line_size, ww, max_line_width)#if @hard_margins
    text << __make_line(line, indent_str, max_line_width, ww.nil?)
    line = ww
    ww = nil
  end

  if (@tag_paragraph and (not text.empty?))
    if @tag_cur.nil? or @tag_cur.empty?
      @tag_cur = @tag_text[0]
    end

    fchar = /(\S)/o.match(text[0])[1]
    white = text[0].index(fchar)

    unless @tag_cur.nil?
      if ((white - @left_margin - 1) > @tag_cur.size) then
        white = @tag_cur.size + @left_margin
        text[0].gsub!(/^ {#{white}}/, "#{' ' * @left_margin}#{@tag_cur}")
      else
        text.unshift("#{' ' * @left_margin}#{@tag_cur}\n")
      end
    end
  end

  text.join('')
end
The formatting object itself can be used as a hyphenator, where the default implementation of hyphenate_to implements the conditions necessary to properly produce SPLIT_CONTINUATION.
# File lib/text/format.rb, line 531
def hyphenate_to(word, size)
  if (size - 2) < 0
    [nil, word]
  else
    [word[0 .. (size - 2)] + "\\", word[(size - 1) .. -1]]
  end
end
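To see how this produces the SPLIT_CONTINUATION example, the method can be applied repeatedly to the remainder of a word. In the sketch below the method body is copied from the listing so that it runs on its own; the driver `continuation_lines` is hypothetical and stands in for the formatter's own loop.

```ruby
# Body copied from the listing above so the sketch runs on its own.
def hyphenate_to(word, size)
  if (size - 2) < 0
    [nil, word]
  else
    [word[0 .. (size - 2)] + "\\", word[(size - 1) .. -1]]
  end
end

# Hypothetical driver: keeps splitting until the remainder fits in `size`.
def continuation_lines(word, size)
  lines = []
  while word.size > size
    head, word = hyphenate_to(word, size)
    break if head.nil? # size too small to split; keep the word whole
    lines << head
  end
  lines << word
end

puts continuation_lines('representation', 5)
```

Each piece is one character shorter than the available width, which leaves room for the continuation character.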
Indicates that the format style is full justification.
Default: | false |
Used in: | format, paragraphs |
# File lib/text/format.rb, line 524
def justify?
  @format_style == JUSTIFY
end
Indicates that the format style is left alignment.
Default: | true |
Used in: | format, paragraphs |
# File lib/text/format.rb, line 500
def left_align?
  @format_style == LEFT_ALIGN
end
Considers each element of text (provided or internal) as a paragraph. If first_indent is the same as body_indent, then paragraphs will be separated by a single empty line in the result; otherwise, the paragraphs will follow immediately after each other. Uses format to do the heavy lifting.
If to_wrap responds to split, then it will be split into an array of elements by calling split with the value of split_on. The default value of split_on is $/, or the default record separator, repeated twice (e.g., /\n\n/).
# File lib/text/format.rb, line 699
def paragraphs(to_wrap = nil, split_on = /(#{$/}){2}/o)
  to_wrap = @text if to_wrap.nil?
  if to_wrap.respond_to?(:split)
    to_wrap = to_wrap.split(split_on)
  else
    to_wrap = [to_wrap].flatten
  end

  if ((@first_indent == @body_indent) or @tag_paragraph) then
    p_end = NEWLINE
  else
    p_end = ''
  end

  cnt = 0
  ret = []
  to_wrap.each do |tw|
    @tag_cur = @tag_text[cnt] if @tag_paragraph
    @tag_cur = '' if @tag_cur.nil?
    line = format(tw)
    ret << "#{line}#{p_end}" if (not line.nil?) and (line.size > 0)
    cnt += 1
  end

  ret[-1].chomp! unless ret.empty?
  ret.join('')
end
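The splitting step can be tried in isolation. The sketch assumes `$/` is "\n" and uses a hypothetical helper, `split_paragraphs`, with a non-capturing group. The default `split_on` in the listing uses a capturing group, and `String#split` also returns the text captured by groups, so with that pattern a lone "\n" element shows up between paragraphs; in the listing such an element would presumably format to an empty string and be skipped by the `line.size > 0` check.

```ruby
# Hypothetical helper: splits on two consecutive newlines, assuming the
# record separator $/ is "\n". The group is non-capturing so that only
# the paragraphs themselves are returned.
def split_paragraphs(text)
  text.split(/(?:\n){2}/)
end

p split_paragraphs("First paragraph,\nstill first.\n\nSecond paragraph.")

# With a capturing group, split also returns the captured separator.
p "a\n\nb".split(/(\n){2}/)
```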
Indicates that the format style is right alignment.
Default: | false |
Used in: | format, paragraphs |
# File lib/text/format.rb, line 508
def right_align?
  @format_style == RIGHT_ALIGN
end
Indicates that the format style is right fill.
Default: | false |
Used in: | format, paragraphs |
# File lib/text/format.rb, line 516
def right_fill?
  @format_style == RIGHT_FILL
end
Splits the provided word so that it is in two parts, word[0 .. (size - 1)] and word[size .. -1].
# File lib/text/format.rb, line 541
def split_word_to(word, size)
  [word[0 .. (size - 1)], word[size .. -1]]
end
Replaces all occurrences of tabstop consecutive spaces with a tab character.
# File lib/text/format.rb, line 766
def unexpand(to_unexpand = nil)
  to_unexpand = @text if to_unexpand.nil?

  tmp = / {#{@tabstop}}/
  changer = lambda do |text|
    res = text.split(NEWLINE_RE)
    res.collect! { |ln| ln.gsub!(tmp, TAB) }
    res.join(NEWLINE)
  end

  if to_unexpand.kind_of?(Array)
    to_unexpand.collect { |tu| changer[tu] }
  else
    changer[to_unexpand]
  end
end
Return true if the word may have an extra space added after it. This will only be the case if extra_space is true and the word is not an abbreviation.
# File lib/text/format.rb, line 786
def __add_extra_space?(word)
  return false unless @extra_space
  word = word.gsub(/\.$/o, '') unless word.nil?
  return false if ABBREV.include?(word)
  return false if @abbreviations.include?(word)
  true
end
This method returns the regular expression used to detect the end of a sentence under the current definition of TERMINAL_PUNCTUATION, terminal_punctuation, TERMINAL_QUOTES, and terminal_quotes.
# File lib/text/format.rb, line 175
def __sentence_end_re
  %r{[#{TERMINAL_PUNCTUATION}#{self.terminal_punctuation}][#{TERMINAL_QUOTES}#{self.terminal_quotes}]?$}
end
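The expression is a character class of terminal punctuation, an optional character class of terminal quotes, and an end-of-line anchor. A free-standing sketch, in which the two constants are copied from the table above and `sentence_end_re` is a hypothetical version that takes the additional characters as arguments instead of reading them from the formatter:

```ruby
# Defaults copied from the constant table above.
TERMINAL_PUNCTUATION = %q(.?!)
TERMINAL_QUOTES      = %q('")

# Hypothetical free-standing version: extra characters are passed in
# rather than read from the formatter's attributes.
def sentence_end_re(extra_punctuation = "", extra_quotes = "")
  %r{[#{TERMINAL_PUNCTUATION}#{extra_punctuation}][#{TERMINAL_QUOTES}#{extra_quotes}]?$}
end

p sentence_end_re
p(('home.' =~ sentence_end_re) ? true : false)
p(('follows:' =~ sentence_end_re(':')) ? true : false)
```

Because the extra characters are interpolated directly into a character class, characters that are special inside a class (such as `]` or `^`) would need escaping first.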
The line must be broken. Typically, this is done by moving the last word on the current line to the next line. However, it may be possible that certain combinations of words may not be broken (see nobreak_regex for more information). Therefore, it may be necessary to move multiple words from the current line to the next line. This function does this.
# File lib/text/format.rb, line 972
def __wrap_line(line, next_word)
  no_break = false

  word_index = line.size - 1

  @nobreak_regex.each_pair do |first, second|
    if line[word_index] =~ first and next_word =~ second
      no_break = true
    end
  end

  # If the last word and the next word aren't to be broken, and the line
  # has more than one word in it, then we need to go back by words to
  # ensure that we break as allowed.
  if no_break and word_index.nonzero?
    word_index -= 1

    while word_index.nonzero?
      no_break = false
      @nobreak_regex.each_pair { |first, second|
        if line[word_index] =~ first and line[word_index + 1] =~ second
          no_break = true
        end
      }

      break unless no_break
      word_index -= 1
    end

    if word_index.nonzero?
      words = line.slice!(word_index .. -1)
      words << next_word
    end
  end

  [line, words]
end