Difference between revisions of "Harmonized Data"

From dataZoa Wiki
(Missing Values)
 
(29 intermediate revisions by the same user not shown)
Line 1: Line 1:
 +
<table>
 +
<tr>
 +
<td>
 +
<div class="Gdib" style="max-width: 20em;">
 +
dataZoa series are automatically <i>harmonized</i>, gracefully blending:
 +
<ul>
 +
<li>Source sites</li>
 +
<li>Frequencies</li>
 +
<li>Date formats</li>
 +
<li>Missing values</li>
 +
</ul>
 +
and eliminating the "data drudgery" of cleaning, aligning, pasting, re-typing, and the like.
 +
</div>
 +
</td>
 +
<td>
 +
<div class="imgWholeWrap1"><div class="imgWhole " data-guts="{ url: '/img/BannerMontage.png', title: 'Hundreds of authoritative sources', xtraStyle: ' min-height: 300px; min-width: 650px;'  }"></div></div></td>
 +
</tr>
 +
</table>
  
Today's Web is filled with Open Data - hundreds of authoritative, original data sources, freely accessible to everyone.
+
<div class="collapseTOC">__TOC__</div>
 
+
Before dataZoa, the actual practice of using Open Data sources was time-consuming, messy, and painful. A major portion of time was wasted on "data drudgery" (cleaning, aligning, pasting, re-typing, and the like), all before any real thought or analysis could begin.
+
 
+
As you bring data into dataZoa, it is <i>harmonized</i> (or "normalized") so that any series can work with any other.  While we preserve all the original series documentation, we load the dates and values into an idealized time series.  This automatically handles the specific nuances of time series data:
+
  
 
=== Date Handling ===
 
=== Date Handling ===
  
 
==== Frequency/Periodicity ====
 
==== Frequency/Periodicity ====
 
+
<div>
http://docwiki.datazoa.com/Application_Conventions#Missing_Data_Values
+
{{Template:Periodicities}}
 +
</div>
  
 
==== Date Range ====
 
==== Date Range ====
 +
<div>
 +
{{Template:Date Range}}
 +
</div>
  
 
==== Gaps ====
 
==== Gaps ====
 +
<div>
 +
dataZoa handles data gaps automatically and gracefully for all regular periodicities.  To prevent this implicit gap handling in calculations, charts, etc., use the "Irregular" periodicity.  For customized treatment of gaps, such as "carry forward", use the [[About_the_ComputeCloud|ComputeCloud]].
 +
</div>
  
 
==== Date Formats ====
 
==== Date Formats ====
 +
<div>
 +
</div>
  
 
===== Inputs =====
 
===== Inputs =====
 +
<div>
 +
{{Template:Date Input Formats}}
 +
</div>
  
 
===== Outputs =====
 
===== Outputs =====
 +
<div>
 +
Dates are formatted using a "human friendly" method appropriate to the periodicity of the data.  Depending on how they are being used, they may show in either compact or more verbose formatting.
 +
</div>
  
 
=== Data Value Handling ===
 
=== Data Value Handling ===
 +
<div>
 +
</div>
  
==== Missing Values ====
+
==== Missing Values Handling ====
 +
<div>
 +
Values can be "missing" for different reasons, with different implications.  dataZoa recognizes several different types of missing values and treats them appropriately.
  
See [[Application_Conventions#Missing_Data_Values|here]] for exact specifications of missing values formats.
+
{{Template:MissingValuesTable}}
 +
</div>
 +
 
 +
===== Calculations =====
 +
<div>
 +
In all calculations, missing data takes precedence: for example, a number plus an NA yields an NA.  Functions in the [[calculations|ComputeCloud]] can provide specialized handling, such as carry-forward.
 +
</div>
 +
 
 +
===== Tables =====
 +
<div>
 +
By default, <i>No Data</i> (ND) is displayed as a blank, while NA and NDD are displayed as "NA" and "NDD", respectively.  To fine-tune these representations, see [[Displays]].
 +
</div>
 +
 
 +
===== Charts =====
 +
<div>
 +
Missing values are typically shown as gaps.
 +
</div>
  
 
==== Value Formatting ====
 
==== Value Formatting ====
 +
<div>
 +
Numeric values are formatted using a "human friendly" method that takes several factors into consideration: magnitude, sign, apparent significance, etc.
 +
 +
When specific formats are required for [[Displays|displays]], options are generally available to control formatting at the display, row, column, and cell level, as appropriate.
 +
</div>
 +
 +
[[Category:Definitions]]

Latest revision as of 13:17, 27 April 2017

dataZoa series are automatically harmonized, gracefully blending:

  • Source sites
  • Frequencies
  • Date formats
  • Missing values

and eliminating the "data drudgery" of cleaning, aligning, pasting, re-typing, and the like.

Date Handling

Frequency/Periodicity

  • PERIODICITIES (FREQUENCIES):
    • Daily
    • Weekly
    • Monthly
    • Quarterly
    • Semiannual
    • Annual
    • Irregular

Date Range

  • EARLIEST DATE:
    • Because of the calendar discontinuity introduced at the Gregorian transition, the earliest year that can be stored accurately to the day is 1753.
    • Dates that are not accurate to the day can begin as early as 1/1/1000.
  • LATEST DATE:
    • 12/31/9999
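
As a rough illustration of the limits above, the following Python sketch classifies a date against the documented boundaries. The constant and function names are hypothetical, and treating 1 January 1753 as the first day-accurate date is an assumption based on the stated year.

```python
from datetime import date

# Boundaries taken from the list above; the names are illustrative only.
EARLIEST_APPROXIMATE = date(1000, 1, 1)   # 1/1/1000
EARLIEST_DAY_ACCURATE = date(1753, 1, 1)  # assumed start of the year 1753
LATEST = date(9999, 12, 31)               # 12/31/9999


def classify_date(d: date) -> str:
    """Describe where a date falls relative to the documented range."""
    if d < EARLIEST_APPROXIMATE:
        return "out of range"
    if d < EARLIEST_DAY_ACCURATE:
        return "approximate"
    # Python's own date type cannot exceed 9999-12-31, so no upper check is needed.
    return "day-accurate"
```

For example, `classify_date(date(1700, 6, 1))` falls in the "approximate" band, while any date from 1753 onward is "day-accurate".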

Gaps

dataZoa handles data gaps automatically and gracefully for all regular periodicities. To prevent this implicit gap handling in calculations, charts, etc., use the "Irregular" periodicity. For customized treatment of gaps, such as "carry forward", use the ComputeCloud.
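
To make "carry forward" concrete, here is a minimal Python sketch of the idea: each gap is filled with the most recent earlier value. This is not the ComputeCloud's function or syntax; the name `carry_forward` and the use of `None` to mark a gap are assumptions made for illustration.

```python
from typing import List, Optional


def carry_forward(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each gap (None) with the most recent earlier value."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for value in values:
        if value is not None:
            last = value  # remember the latest real observation
        filled.append(last)
    return filled
```

Gaps at the very start of a series have nothing to carry forward, so this sketch leaves them as `None`.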

Date Formats

Inputs

Both American-style (mm-dd-yyyy) and international-style (dd-mm-yyyy) date formats are supported.
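
One way to accept both conventions is to try a list of patterns in order. The Python sketch below does this for a few of the format families in this section. It does not describe dataZoa's actual parser: the `day_first` flag and the rule for resolving ambiguous dates such as 01-02-2011 are assumptions.

```python
from datetime import date, datetime

# A few format families from this section, written as strptime patterns.
MONTH_FIRST = ["%m-%d-%Y", "%m/%d/%Y"]  # American style
DAY_FIRST = ["%d-%m-%Y", "%d/%m/%Y"]    # international style
UNAMBIGUOUS = ["%Y-%m-%d", "%Y%m%d", "%d-%b-%y", "%B %d, %Y"]


def parse_date(text: str, day_first: bool = False) -> date:
    """Try each pattern in turn and return the first successful parse."""
    # Assumption: the caller decides which style wins for ambiguous input.
    styles = DAY_FIRST + MONTH_FIRST if day_first else MONTH_FIRST + DAY_FIRST
    for pattern in styles + UNAMBIGUOUS:
        try:
            return datetime.strptime(text.strip(), pattern).date()
        except ValueError:
            continue  # this pattern does not fit; try the next one
    raise ValueError(f"unrecognized date: {text!r}")
```

A string such as 15-01-2011 can only be read day-first, so it parses the same way under either setting. Only inputs where both numbers are 12 or less depend on the flag.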

Supported date formats include:

  • FULLY SPECIFIED:
    • 01-15-2011
    • 01-15-11
    • 01/15/2011
    • 01/15/11
    • 15-01-2011
    • 15-01-11
    • 15/01/2011
    • 15/01/11
    • 15-Jan-11
    • 15JAN2011
    • 2011-01-15
    • 20110115
    • Jan 15 2011
    • January 15, 2011
    • Sunday, January 15, 2011
    • 110115
  • YEAR ONLY:
    • 2011
  • MONTH, YEAR ONLY:
    • 1/11
    • 1/2011
    • 01-2011
    • 2011m01
    • 2011 Jan
    • 2011Jan
    • Jan 2011
    • Jan2011
  • SEMI-ANNUAL:
    • 2011H1
    • 2011h1
  • QUARTERLY:
    • 2011Q1
    • 2011q1

Outputs

Dates are formatted using a "human friendly" method appropriate to the periodicity of the data. Depending on how they are being used, they may show in either compact or more verbose formatting.

Data Value Handling

Missing Values Handling

Values can be "missing" for different reasons, with different implications. dataZoa recognizes several different types of missing values and treats them appropriately.

Non-Numeric Type   | Meaning                                  | String to enter as Input | String displayed as Output
No Data            | Data point is known not to exist         | ##ND##                   | ND
Non-Disclosed Data | Data point exists but is not being shown | ##NDD##                  | NDD
Not Available      | Data may exist but is not available      | ##NA##                   | NA

Calculations

In all calculations, missing data takes precedence: for example, a number plus an NA yields an NA. Functions in the ComputeCloud can provide specialized handling, such as carry-forward.
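
The precedence rule can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not dataZoa's implementation: the representation of missing values as strings and the tie-break when both operands are missing are assumptions, since the text only specifies the number-plus-missing case.

```python
from typing import Union

# Output strings of the three non-numeric types listed above.
MISSING_MARKERS = {"ND", "NDD", "NA"}

Value = Union[float, str]


def is_missing(value: Value) -> bool:
    """True when the cell holds one of the missing-value markers."""
    return isinstance(value, str) and value in MISSING_MARKERS


def add(a: Value, b: Value) -> Value:
    """Add two cells; a missing marker on either side wins over a number."""
    if is_missing(a):
        return a  # assumption: with two markers, the left one is kept
    if is_missing(b):
        return b
    return a + b
```

With this sketch, `add(1.5, "NA")` yields `"NA"`, matching the example in the paragraph above.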

Tables

By default, No Data (ND) is displayed as a blank, while NA and NDD are displayed as "NA" and "NDD", respectively. To fine-tune these representations, see Displays.
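
The default table behavior amounts to a small lookup from input marker to displayed text. The Python sketch below assumes a hypothetical helper, `table_cell`, that receives the raw input string. It mirrors only the defaults described above, not any customization available through Displays.

```python
# Input marker -> default table text, per the paragraph above.
DEFAULT_TABLE_TEXT = {
    "##ND##": "",      # No Data is shown as a blank
    "##NA##": "NA",
    "##NDD##": "NDD",
}


def table_cell(raw: str) -> str:
    """Return the default table text for a raw cell value."""
    text = raw.strip()
    # Anything that is not a missing-value marker passes through unchanged.
    return DEFAULT_TABLE_TEXT.get(text, text)
```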

Charts

Missing values are typically shown as gaps.

Value Formatting

Numeric values are formatted using a "human friendly" method that takes several factors into consideration: magnitude, sign, apparent significance, etc.

When specific formats are required for displays, options are generally available to control formatting at the display, row, column, and cell level, as appropriate.