Document Summary System


Frequently Asked Questions
Document Summary System - Standard Edition



6-1: What is a Project within the Document Summary System?

A Project is like a directory that holds files. It stores and manages one or more documents, along with the additional information saved for each document during analysis. You may create multiple Projects, each of which may contain multiple documents.

6-2: What is the use of the "Summarization Settings" dialog (Options >> Summarization Settings)?

The summarization settings dialog is shown below. The options allow you to control the summarization process.

The Summarization Settings are system-wide: they apply to every Project as it is loaded or populated, and they stay in effect until you change them manually.
Controlling the Short Summary:
The size of the Short Summary is controlled by “Number of Sentences of Short Summary”. This option specifies the maximum number of sentences that will be included in the Short Summary. 
The predefined settings are:  3 (Short) / 7 (Normal) [Default] / 10 (Long).

You may also enter any other number of sentences. For example, entering 20 allows a maximum of 20 sentences in the Short Summary.

The summary may contain fewer sentences than you requested if the document does not have enough information to provide that many.
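The cap described above can be sketched as a simple rule. This is an illustration only: the function name `short_summary_length` and the `available` count are assumptions made for the example, not part of the Document Summary System.

```python
# Predefined values of "Number of Sentences of Short Summary" from the dialog.
PRESETS = {"Short": 3, "Normal": 7, "Long": 10}


def short_summary_length(requested: int, available: int) -> int:
    """Return how many sentences a Short Summary can contain.

    requested: the "Number of Sentences of Short Summary" setting.
    available: number of usable sentences found in the document
               (a hypothetical quantity, used here for illustration).
    """
    if requested < 0 or available < 0:
        raise ValueError("sentence counts must be non-negative")
    # The setting is a maximum, so a short document yields fewer sentences.
    return min(requested, available)
```

With the default preset and a long document, `short_summary_length(PRESETS["Normal"], 100)` gives 7. Requesting 20 sentences from a document that only offers 12 gives 12.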
Controlling the Full Summary:
The size of the Full Summary is controlled by “Compression Ratio of Full Summary”. This option specifies the size of the Full Summary relative to the size of the document being summarized.
The predefined settings are:   0.2 (Shorter) / 0.5 (Normal) [Default] / 0.7 (Longer).

You may enter a number between 0.00 and 1.00 in this field. The extremes are rarely useful: 0.00 will normally produce a single sentence, and 1.00 will normally produce a large summary with approximately the same number of sentences as the original document.

If the original document is 1000 sentences long and the Compression Ratio setting is 0.1, then the Full Summary will have approximately 100 sentences.
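The arithmetic above can be sketched as follows. The function name `full_summary_length`, the rounding, and the one-sentence floor are assumptions chosen to match the behavior described in this answer. The program's actual calculation is not documented here, and its results are only approximate.

```python
def full_summary_length(total_sentences: int, compression_ratio: float) -> int:
    """Estimate the number of sentences in a Full Summary.

    total_sentences:   sentences in the original document.
    compression_ratio: the "Compression Ratio of Full Summary" setting (0.00 to 1.00).
    """
    if not 0.0 <= compression_ratio <= 1.0:
        raise ValueError("compression ratio must be between 0.00 and 1.00")
    if total_sentences <= 0:
        return 0
    estimate = round(total_sentences * compression_ratio)
    # Assumed floor of one sentence, since 0.00 "will normally produce one sentence".
    return max(1, min(total_sentences, estimate))
```

For a 1000-sentence document, a ratio of 0.1 gives an estimate of 100 sentences, and the default ratio of 0.5 gives 500.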
Controlling the Summarization Content:
The content of the summary is controlled by “Summary Bias towards Keywords and Query (0 - 1)”. This option sets how much weight the summarization process gives to Queries and Keywords when selecting sentences. At the extremes, 0 ignores the Queries and Keywords, and 1 causes the program to use only the Queries and Keywords. The Bias setting affects both the Short and Full Summaries.
The predefined settings are:   0 (Ignore) / 0.35 (Normal) [Default] / 0.65 (Heavy) / 1 (Completely).

You may enter any Bias value between 0 and 1. There may be little difference in the results between Bias values that are numerically close, for example, 0.35 and 0.40.

A Bias setting of 0 is useful if you have several documents with individual Queries or Keywords that you want to temporarily ignore. That is, you want document summaries without the Queries and Keywords, but you don't want to delete the existing Queries and Keywords. Without this option, you would have to modify the specifications for each document summary.
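One way to picture the Bias setting is as a weight that blends two sentence scores. The sketch below is an assumption made for illustration: the linear blend, the function name `blended_score`, and both score inputs are hypothetical, and the program's real scoring method is not described in this FAQ.

```python
def blended_score(general_score: float, query_score: float, bias: float) -> float:
    """Blend a sentence's general importance with its Query/Keyword relevance.

    general_score: hypothetical importance of the sentence within the document.
    query_score:   hypothetical relevance of the sentence to Queries and Keywords.
    bias:          the "Summary Bias towards Keywords and Query" setting (0 to 1).
    """
    if not 0.0 <= bias <= 1.0:
        raise ValueError("bias must be between 0 and 1")
    # bias = 0 ignores the Query/Keyword score; bias = 1 uses nothing else.
    return (1.0 - bias) * general_score + bias * query_score
```

Under this model, a Bias of 0 returns the general score unchanged and a Bias of 1 returns only the Query and Keyword score. Nearby values such as 0.35 and 0.40 produce nearly identical blends, which is consistent with the small differences noted above.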
Code Page selection:
This setting affects how the content of a document is read when it is imported into the Document Summary System. The code page helps with documents that contain special characters that are not being recognized, which normally happens with web pages. The normal setting is “UTF-8”. If an imported web page shows unusual characters, try changing the setting to “Unicode” and import the document again.
The predefined settings are:  UTF-8 [Default] / Unicode.

These are the only settings possible.
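The effect of picking the wrong code page can be sketched in a few lines. This example assumes that “Unicode” means UTF-16, as it does in much Windows software. The FAQ does not say so, and the names `CODE_PAGES`, `decode_document`, and `looks_garbled` are hypothetical.

```python
# Assumed mapping from the dialog's options to Python codec names.
CODE_PAGES = {"UTF-8": "utf-8", "Unicode": "utf-16"}


def decode_document(raw: bytes, code_page: str = "UTF-8") -> str:
    """Decode imported bytes, marking unrecognized bytes with U+FFFD."""
    encoding = CODE_PAGES[code_page]
    return raw.decode(encoding, errors="replace")


def looks_garbled(text: str) -> bool:
    """Return True if decoding had to substitute replacement characters."""
    return "\ufffd" in text


raw = "café".encode("utf-16")          # a document saved as UTF-16
first_try = decode_document(raw)       # default UTF-8 setting
if looks_garbled(first_try):
    # Same remedy as in the FAQ: switch the setting and import again.
    second_try = decode_document(raw, "Unicode")
```

Here the first attempt contains replacement characters, while the second attempt recovers the original text.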
Automatically Fix Line Breaks:
This setting has its primary effect when a document is read into the Document Summary System. Normally this option should remain checked. The text is examined to combine separate lines into complete sentences, even if the original document has a carriage return and/or line feed in the middle of a sentence. The sentence structure takes precedence over embedded control characters. This corrects many text files and converted PDF files so that they can be summarized correctly.
The default setting is:  Checked [Default].

If you are importing a very unusual document in which line breaks should take precedence over sentence structure, uncheck this option. Do not forget to recheck the option after importing the file.
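The idea behind fixing line breaks can be sketched as follows. This is a deliberately simple heuristic written for illustration, not the program's actual algorithm: it assumes a sentence ends with one of a few punctuation marks, and the name `fix_line_breaks` is hypothetical.

```python
SENTENCE_END = (".", "!", "?", ":")


def fix_line_breaks(text: str) -> str:
    """Join lines that were broken in the middle of a sentence."""
    merged = []
    for raw_line in text.splitlines():   # handles CR, LF, and CR+LF
        line = raw_line.strip()
        if not line:
            merged.append("")            # keep blank lines as paragraph breaks
            continue
        previous_is_open = (
            bool(merged) and merged[-1] != "" and not merged[-1].endswith(SENTENCE_END)
        )
        if previous_is_open:
            # The previous line did not finish its sentence, so continue it.
            merged[-1] = merged[-1] + " " + line
        else:
            merged.append(line)
    return "\n".join(merged)
```

For example, the input `"The quick brown\r\nfox jumps.\r\nNext sentence."` becomes two lines, `The quick brown fox jumps.` and `Next sentence.`, so each sentence can be scored as a whole.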

6-3: What is the purpose of the Project Query and Project Keywords? (View >> Project Profile)

The Project Queries and Keywords that the user inputs on the Project Profile screen become global settings that apply to all documents in the Project.

For example, if the same questions apply to all documents in a Project, you can input the questions into the Project Profile just once rather than repeat them in each document in the Project.

6-4: What is the purpose of the Query and Keywords inputs on the Profile Tab for each document within a Project?

The Query and Keywords that you enter on the document's Profile Tab apply only to that document.

6-5: What happens when Queries and Keywords are entered on both the Project Profile and the Profile Tab?

Both sets of Queries and Keywords will be considered in summarizing a specific document.
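The relationship between Project-level and document-level entries can be modeled as below. The class and method names (`Project`, `Document`, `effective_terms`) and the sample file name are hypothetical, and simply concatenating the two sets is an assumption. The FAQ only states that both sets are considered.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Document:
    name: str
    queries: List[str] = field(default_factory=list)    # from the document's Profile Tab
    keywords: List[str] = field(default_factory=list)


@dataclass
class Project:
    name: str
    queries: List[str] = field(default_factory=list)    # from the Project Profile
    keywords: List[str] = field(default_factory=list)
    documents: List[Document] = field(default_factory=list)

    def effective_terms(self, document: Document) -> Tuple[List[str], List[str]]:
        """Return the Queries and Keywords considered for one document."""
        return (self.queries + document.queries,
                self.keywords + document.keywords)


project = Project("Research", queries=["What were the results?"], keywords=["summary"])
report = Document("report.txt", queries=["Who funded the study?"])
project.documents.append(report)
queries, keywords = project.effective_terms(report)
```

In this sketch, `queries` holds both the Project question and the document's own question, while `keywords` holds only the Project keyword because the document defines none.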

6-6: What is the meaning of the Ranking?

The Ranking indicates how relevant a document is to the Queries and Keywords that you have entered. Use it to compare the relevance of one document to another within the Project. The Ranking reflects the presence of information relevant to the Query or Keywords within the document.

6-7: What are the values in the Status and what do they mean?

The Status field shows what has been done with the document. The values and their meanings are:
NewAdded: The document has been added to the Project, but not yet summarized.
Summarized: The document has been summarized.

6-8: What is the meaning of size?

The size field indicates the size, in kilobytes, of the information read into the Document Summary System. This is the size of the information created during the import process. It is not the number of characters and is not necessarily representative of the size of the text.

6-9: How can the originally imported document be viewed?

Select the document you wish to view and click “Open Original” on the toolbar. If you move, rename, or delete the file after import, “Open Original” will not be able to display it.