Bean Sheet Data Formats

When performing operations on cells, every cell in a Bean Sheet document is thought of as containing a Java Bean object1. Such an object can be the result of a formula evaluation or it could simply occupy the cell in question. However, from the standpoint of data entry and display, as well as certain export-type persistence models, there needs to be a way to present non-textual cell contents and vice versa -- parse textual input to produce correct Java Beans. Bean Sheet relies on standard Java text format mechanism as it applies to objects. A user can apply a format to a cell or column range, which then governs the way objects are formatted and parsed. Anytime, a format is present and the formatting or parsing operation fails, Bean Sheet silently fails over to the default way of formatting and parsing cell data -- using object's toString() method on display and setting the value to a String object representing textual input on data entry. There are 3 types of formats that can be applied to a cell or column range: date / time, number, and message.

Date / Time Format2

Date and Time Patterns

Date and time formats are specified by date and time pattern strings. Within date and time pattern strings, unquoted letters from 'A' to 'Z' and from 'a' to 'z' are interpreted as pattern letters representing the components of a date or time string. Text can be quoted using single quotes (') to avoid interpretation. "''" represents a single quote. All other characters are not interpreted; they're simply copied into the output string during formatting or matched against the input string during parsing.

The following pattern letters are defined (all other characters from 'A' to 'Z' and from 'a' to 'z' are reserved):

Letter

Date or Time Component

Presentation

Examples

G

Era designator

Text

AD

y

Year

Year

1996; 96

M

Month in year

Month

July; Jul; 07

w

Week in year

Number

27

W

Week in month

Number

2

D

Day in year

Number

189

d

Day in month

Number

10

F

Day of week in month

Number

2

E

Day in week

Text

Tuesday; Tue

a

Am/pm marker

Text

PM

H

Hour in day (0-23)

Number

0

k

Hour in day (1-24)

Number

24

K

Hour in am/pm (0-11)

Number

0

h

Hour in am/pm (1-12)

Number

12

m

Minute in hour

Number

30

s

Second in minute

Number

55

S

Millisecond

Number

978

z

Time zone

General time zone

Pacific Standard Time; PST; GMT-08:00

Z

Time zone

RFC 822 time zone

-0800

Pattern letters are usually repeated, as their number determines the exact presentation:

SimpleDateFormat also supports localized date and time pattern strings. In these strings, the pattern letters described above may be replaced with other, locale dependent, pattern letters. SimpleDateFormat does not deal with the localization of text other than the pattern letters; that's up to the client of the class.

Examples

The following examples show how date and time patterns are interpreted in the U.S. locale. The given date and time are 2001-07-04 12:08:56 local time in the U.S. Pacific Time time zone.

Date and Time Pattern

Result

"yyyy.MM.dd G 'at' HH:mm:ss z"

2001.07.04 AD at 12:08:56 PDT

"EEE, MMM d, ''yy"

Wed, Jul 4, '01

"h:mm a"

12:08 PM

"hh 'o''clock' a, zzzz"

12 o'clock PM, Pacific Daylight Time

"K:mm a, z"

0:08 PM, PDT

"yyyyy.MMMMM.dd GGG hh:mm aaa"

02001.July.04 AD 12:08 PM

"EEE, d MMM yyyy HH:mm:ss Z"

Wed, 4 Jul 2001 12:08:56 -0700

"yyMMddHHmmssZ"

010704120856-0700

Number Format3

Patterns

DecimalFormat patterns have the following syntax:

 Pattern:
         PositivePattern
         PositivePattern ; NegativePattern
 PositivePattern:
         Prefixopt Number Suffixopt
 NegativePattern:
         Prefixopt Number Suffixopt
 Prefix:
         any Unicode characters except \uFFFE, \uFFFF, and special characters
 Suffix:
         any Unicode characters except \uFFFE, \uFFFF, and special characters
 Number:
         Integer Exponentopt
         Integer . Fraction Exponentopt
 Integer:
         MinimumInteger
         #
         # Integer
         # , Integer
 MinimumInteger:
         0
         0 MinimumInteger
         0 , MinimumInteger
 Fraction:
         MinimumFractionopt OptionalFractionopt
 MinimumFraction:
         0 MinimumFractionopt
 OptionalFraction:
         # OptionalFractionopt
 Exponent:
         E MinimumExponent
 MinimumExponent:
         0 MinimumExponentopt
 

A DecimalFormat pattern contains a positive and negative subpattern, for example, "#,##0.00;(#,##0.00)". Each subpattern has a prefix, numeric part, and suffix. The negative subpattern is optional; if absent, then the positive subpattern prefixed with the localized minus sign (code>'-' in most locales) is used as the negative subpattern. That is, "0.00" alone is equivalent to "0.00;-0.00". If there is an explicit negative subpattern, it serves only to specify the negative prefix and suffix; the number of digits, minimal digits, and other characteristics are all the same as the positive pattern. That means that "#,##0.0#;(#)" produces precisely the same behavior as "#,##0.0#;(#,##0.0#)".

The prefixes, suffixes, and various symbols used for infinity, digits, thousands separators, decimal separators, etc. may be set to arbitrary values, and they will appear properly during formatting. However, care must be taken that the symbols and strings do not conflict, or parsing will be unreliable. For example, either the positive and negative prefixes or the suffixes must be distinct for DecimalFormat.parse() to be able to distinguish positive from negative values. (If they are identical, then DecimalFormat will behave as if no negative subpattern was specified.) Another example is that the decimal separator and thousands separator should be distinct characters, or parsing will be impossible.

The grouping separator is commonly used for thousands, but in some countries it separates ten-thousands. The grouping size is a constant number of digits between the grouping characters, such as 3 for 100,000,000 or 4 for 1,0000,0000. If you supply a pattern with multiple grouping characters, the interval between the last one and the end of the integer is the one that is used. So "#,##,###,####" == "######,####" == "##,####,####".

Special Pattern Characters

Many characters in a pattern are taken literally; they are matched during parsing and output unchanged during formatting. Special characters, on the other hand, stand for other characters, strings, or classes of characters. They must be quoted, unless noted otherwise, if they are to appear in the prefix or suffix as literals.

The characters listed here are used in non-localized patterns. Localized patterns use the corresponding characters taken from this formatter's DecimalFormatSymbols object instead, and these characters lose their special status. Two exceptions are the currency sign and quote, which are not localized.

Symbol

Location

Localized?

Meaning

0

Number

Yes

Digit

#

Number

Yes

Digit, zero shows as absent

.

Number

Yes

Decimal separator or monetary decimal separator

-

Number

Yes

Minus sign

,

Number

Yes

Grouping separator

E

Number

Yes

Separates mantissa and exponent in scientific notation. Need not be quoted in prefix or suffix.

;

Subpattern boundary

Yes

Separates positive and negative subpatterns

%

Prefix or suffix

Yes

Multiply by 100 and show as percentage

\u2030

Prefix or suffix

Yes

Multiply by 1000 and show as per mille

¤ (\u00A4)

Prefix or suffix

No

Currency sign, replaced by currency symbol. If doubled, replaced by international currency symbol. If present in a pattern, the monetary decimal separator is used instead of the decimal separator.

'

Prefix or suffix

No

Used to quote special characters in a prefix or suffix, for example, "'#'#" formats 123 to "#123". To create a single quote itself, use two in a row: "# o''clock".

Scientific Notation

Numbers in scientific notation are expressed as the product of a mantissa and a power of ten, for example, 1234 can be expressed as 1.234 x 10^3. The mantissa is often in the range 1.0 <= x < 10.0, but it need not be. DecimalFormat can be instructed to format and parse scientific notation only via a pattern; there is currently no factory method that creates a scientific notation format. In a pattern, the exponent character immediately followed by one or more digit characters indicates scientific notation. Example: "0.###E0" formats the number 1234 as "1.234E3".

Rounding

DecimalFormat uses half-even rounding (see ROUND_HALF_EVEN) for formatting.

Digits

For formatting, DecimalFormat uses the ten consecutive characters starting with the localized zero digit defined in the DecimalFormatSymbols object as digits. For parsing, these digits as well as all Unicode decimal digits, as defined by Character.digit, are recognized.

Special Values

NaN is formatted as a single character, typically \uFFFD. This character is determined by the DecimalFormatSymbols object. This is the only value for which the prefixes and suffixes are not used.

Infinity is formatted as a single character, typically \u221E, with the positive or negative prefixes and suffixes applied. The infinity character is determined by the DecimalFormatSymbols object.

Negative zero ("-0") parses to Double(-0.0), unless isParseIntegerOnly() is true, in which case it parses to Long(0).

Message Format4

Patterns and Their Interpretation

MessageFormat uses patterns of the following form:

 MessageFormatPattern:
         String
         MessageFormatPattern FormatElement String

 FormatElement:
         { ArgumentIndex }
         { ArgumentIndex , FormatType }
         { ArgumentIndex , FormatType , FormatStyle }

 FormatType: one of 
         number date time choice

 FormatStyle:
         short
         medium
         long
         full
         integer
         currency
         percent
         SubformatPattern

 String:
         StringPartopt
         String StringPart

 StringPart:
         ''
         ' QuotedString '
         UnquotedString

 SubformatPattern:
         SubformatPatternPartopt
         SubformatPattern SubformatPatternPart

 SubFormatPatternPart:
         ' QuotedPattern '
         UnquotedPattern
 

Within a String, "''" represents a single quote. A QuotedString can contain arbitrary characters except single quotes; the surrounding single quotes are removed. An UnquotedString can contain arbitrary characters except single quotes and left curly brackets. Thus, a string that should result in the formatted message "'{0}'" can be written as "'''{'0}''" or "'''{0}'''".

Within a SubformatPattern, different rules apply. A QuotedPattern can contain arbitrary characters except single quotes; but the surrounding single quotes are not removed, so they may be interpreted by the subformat. For example, "{1,number,$'#',##}" will produce a number format with the pound-sign quoted, with a result such as: "$#31,45". An UnquotedPattern can contain arbitrary characters except single quotes, but curly braces within it must be balanced. For example, "ab {0} de" and "ab '}' de" are valid subformat patterns, but "ab {0'}' de" and "ab } de" are not.

The ArgumentIndex value is a non-negative integer written using the digits '0' through '9', and represents an index into the arguments array passed to the format methods or the result array returned by the parse methods.

The FormatType and FormatStyle values are used to create a Format instance for the format element. The following table shows how the values map to Format instances. Combinations not shown in the table are illegal. A SubformatPattern must be a valid pattern string for the Format subclass used.

Format Type

Format Style

Subformat Created

(none)

(none)

null

number

(none)

NumberFormat.getInstance(getLocale())

integer

NumberFormat.getIntegerInstance(getLocale())

currency

NumberFormat.getCurrencyInstance(getLocale())

percent

NumberFormat.getPercentInstance(getLocale())

SubformatPattern

new DecimalFormat(subformatPattern, new DecimalFormatSymbols(getLocale()))

date

(none)

DateFormat.getDateInstance(DateFormat.DEFAULT, getLocale())

short

DateFormat.getDateInstance(DateFormat.SHORT, getLocale())

medium

DateFormat.getDateInstance(DateFormat.DEFAULT, getLocale())

long

DateFormat.getDateInstance(DateFormat.LONG, getLocale())

full

DateFormat.getDateInstance(DateFormat.FULL, getLocale())

SubformatPattern

new SimpleDateFormat(subformatPattern, getLocale())

time

(none)

DateFormat.getTimeInstance(DateFormat.DEFAULT, getLocale())

short

DateFormat.getTimeInstance(DateFormat.SHORT, getLocale())

medium

DateFormat.getTimeInstance(DateFormat.DEFAULT, getLocale())

long

DateFormat.getTimeInstance(DateFormat.LONG, getLocale())

full

DateFormat.getTimeInstance(DateFormat.FULL, getLocale())

SubformatPattern

new SimpleDateFormat(subformatPattern, getLocale())

choice

SubformatPattern

new ChoiceFormat(subformatPattern)

Applying Formats To Existing Data

Formats are applied to cell and column ranges -- not to the objects they may contain. In the case where a set of cells contain textual data that needs to be parsed using a particular format, the following procedure is recommended:

  1. Select appropriate cell or column range.

  2. Select menu Table --> Column Format or Cell Format.

  3. Specify desired format and pattern and press OK button.

  4. If the format is being applied to a column range, select cells containing data that needs to be parsed. If there are gaps in the column range, select cells from consecutive column ranges and perform the following steps for each such selection.

  5. Select menu Edit --> Copy.

  6. Select menu Edit --> Paste.

1It has to be a Java Bean in order for most persistence mechanisms to work. Without worrying about persistence, any Java object could theoretically occupy a cell.

2http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html

3http://java.sun.com/j2se/1.4.2/docs/api/java/text/DecimalFormat.html

4http://java.sun.com/j2se/1.4.2/docs/api/java/text/MessageFormat.html