org.apache.poi.hslf.extractor
Class PowerPointExtractor

java.lang.Object
  extended by org.apache.poi.hslf.extractor.PowerPointExtractor

public class PowerPointExtractor
extends java.lang.Object

Extracts the text from a PowerPoint file. It can optionally extract the notes as well.

Author:
Nick Burch

Constructor Summary
PowerPointExtractor(HSLFSlideShow ss)
          Creates a PowerPointExtractor from an HSLFSlideShow
PowerPointExtractor(java.io.InputStream iStream)
          Creates a PowerPointExtractor from an input stream
PowerPointExtractor(POIFSFileSystem fs)
          Creates a PowerPointExtractor from an open POIFSFileSystem
PowerPointExtractor(java.lang.String fileName)
          Creates a PowerPointExtractor from a file
 
Method Summary
 void close()
          Shuts down the underlying streams
 java.lang.String getNotes()
          Fetches all the notes text from the slideshow, but not the slide text
 java.lang.String getText()
          Fetches all the slide text from the slideshow, but not the notes
 java.lang.String getText(boolean getSlideText, boolean getNoteText)
          Fetches text from the slideshow, be it slide text or note text.
static void main(java.lang.String[] args)
          Basic extractor.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PowerPointExtractor

public PowerPointExtractor(java.lang.String fileName)
                    throws java.io.IOException
Creates a PowerPointExtractor from a file

Parameters:
fileName - The name of the file to extract from
Throws:
java.io.IOException

PowerPointExtractor

public PowerPointExtractor(java.io.InputStream iStream)
                    throws java.io.IOException
Creates a PowerPointExtractor from an input stream

Parameters:
iStream - The input stream containing the PowerPoint document
Throws:
java.io.IOException

PowerPointExtractor

public PowerPointExtractor(POIFSFileSystem fs)
                    throws java.io.IOException
Creates a PowerPointExtractor from an open POIFSFileSystem

Parameters:
fs - the POIFSFileSystem containing the PowerPoint document
Throws:
java.io.IOException

PowerPointExtractor

public PowerPointExtractor(HSLFSlideShow ss)
                    throws java.io.IOException
Creates a PowerPointExtractor from an HSLFSlideShow

Parameters:
ss - the HSLFSlideShow to extract text from
Throws:
java.io.IOException

Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Basic command-line extractor. Outputs all the slide text and, optionally, all the notes

Throws:
java.io.IOException

close

public void close()
           throws java.io.IOException
Shuts down the underlying streams

Throws:
java.io.IOException

getText

public java.lang.String getText()
Fetches all the slide text from the slideshow, but not the notes


getNotes

public java.lang.String getNotes()
Fetches all the notes text from the slideshow, but not the slide text


getText

public java.lang.String getText(boolean getSlideText,
                                boolean getNoteText)
Fetches text from the slideshow, be it slide text or note text. Because the final block of text in a TextRun normally has its last \n stripped, it is added back

Parameters:
getSlideText - whether to fetch the slide text
getNoteText - whether to fetch the note text
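
The trailing-newline handling described above can be illustrated with a small standalone sketch. This is not POI's implementation: the class SlideTextSketch, the helper appendBlock, and the use of plain string lists in place of TextRun objects are all assumptions made for illustration. It only shows the idea of selecting slide and/or note text with two flags and restoring a stripped final \n so that consecutive blocks do not run together.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for illustration only; not part of Apache POI. */
public class SlideTextSketch {

    /** Appends one block of text, adding back the final newline if it is missing. */
    public static void appendBlock(StringBuilder out, String block) {
        out.append(block);
        if (!block.endsWith("\n")) {
            out.append('\n');
        }
    }

    /**
     * Joins the selected groups of text blocks into one string.
     * The two flags mirror the getSlideText / getNoteText parameters.
     */
    public static String getText(List<String> slideBlocks, List<String> noteBlocks,
                                 boolean getSlideText, boolean getNoteText) {
        StringBuilder out = new StringBuilder();
        if (getSlideText) {
            for (String block : slideBlocks) {
                appendBlock(out, block);
            }
        }
        if (getNoteText) {
            for (String block : noteBlocks) {
                appendBlock(out, block);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample data: the second slide block has lost its trailing newline.
        List<String> slides = Arrays.asList("Title\n", "First bullet\nLast bullet");
        List<String> notes = Arrays.asList("Speaker note");

        System.out.print(getText(slides, notes, true, false)); // slide text only
        System.out.print(getText(slides, notes, false, true)); // note text only
    }
}
```

In this sketch, passing (true, false) plays the role of the no-argument getText(), and (false, true) plays the role of getNotes(); whether the real class delegates in exactly this way is not stated on this page.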


Copyright 2007 The Apache Software Foundation or its licensors, as applicable.