May 1996
User Characterization Team, EOSDIS Core System, Landover, Maryland
The ECS User Characterization Information Catalog describes the most current ECS user pull information available. The catalog contains information from and about the "User Pull" portions of the Technical Baseline, together with abstracts of specific development questions that User Characterization has answered. We developed this catalog specifically for the ECS developers. By browsing and reading this catalog you can find out what information is currently available as well as what types of questions your co-workers have been asking. You may find the precise information you need or related information that could be useful, or you may want to submit a request for information from User Characterization. (A request form can be obtained by contacting Joe Miller or Tess Wingo.) We will continue to add new information and abstracts to this catalog so that developers can remain up to date on what information is available to them and what information is being used by their co-workers.
Information Catalog contacts:
The User Pull Baseline describes the anticipated science user load on the EOSDIS Core System for five time frames: April 1997 (as the migration of Version 0 data sets is beginning), April 1998 (when TRMM products are available), April 1999 (when AM-1, Landsat 7, and COLOR products have been added to the system), June 1999 (includes the addition of ALT RADAR and SWS data products), and January 2000. Information in this baseline includes the anticipated number of users, the number of system accesses, the anticipated volume of data distributed to the users, and the total archive volume.
The DAAC Pull Baseline provides an estimation of user load on each of the DAACs. Information in this baseline spreadsheet is organized by DAAC. For each DAAC in the 5 time frames, estimates are provided for the total DAAC archive volume (in TB), volume of distributed data, rate of data archiving (GB/day), number of users per year with a 'high' and 'low' estimate, and number of DAAC accesses per year with a 'high' and 'low' estimate. Values in each of these information categories are totaled across all the DAACs for each time frame.
The results were derived by first projecting the future volume of traffic from all user communities for each of the DAACs and estimating the percentage of this traffic that is due to system use by the earth science community. Some of these projections originated from the DAAC personnel, while others are estimates based upon the experience within the ECS project team.
The resulting projections include the number of science users that can be anticipated, the annual volume of data that will be distributed to these users, the number of system accesses that can be anticipated and the resulting size of the archive for Level 1 through Level 4 data products.
This baseline describes U.S. and foreign science users only. The total market size is estimated at 11,000-18,000 people with the EOSDIS ultimate share equaling 11,000-17,000 people. It is assumed that the EOSDIS 11,000-17,000 user-market share will have access to the necessary hardware, software and communication capacity to make use of all available ECS-provided user services.
The rate and volume of data to be migrated from Version 0 is based upon the "V0 Migration Baseline 1/19/96". The distribution volume for V0 products is assumed to be 2x the volume migrated into ECS in each time frame. The volume of RADARSAT, ERS 1/2, and JERS 1 data is based upon technical discussions with Ruth Duerr and other ASF personnel.
For AM-1, TRMM, COLOR, ALT RADAR, and ADEOS II, the volume of data distributed is assumed to be 2x the production volume. The only exception to this estimate is Landsat 7, whose distribution is assumed to be 50 GB/day, 365 days/yr.
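The distribution rule above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and any production volume passed in is a placeholder rather than a baseline figure. Only the 2x factor and the Landsat 7 rate come from the text.

```python
# Sketch of the distribution-volume assumption; names are hypothetical.
DISTRIBUTION_FACTOR = 2.0        # distribution assumed to be 2x production
LANDSAT7_GB_PER_DAY = 50.0       # stated exception for Landsat 7
LANDSAT7_DAYS_PER_YEAR = 365


def annual_distribution_gb(mission: str, annual_production_gb: float) -> float:
    """Return the assumed annual distribution volume (GB) for one mission."""
    if mission == "Landsat 7":
        # Landsat 7 uses a fixed daily rate instead of the 2x rule.
        return LANDSAT7_GB_PER_DAY * LANDSAT7_DAYS_PER_YEAR
    return DISTRIBUTION_FACTOR * annual_production_gb
```

For example, a mission with a (made-up) production volume of 100 GB per year would be assigned 200 GB of distribution, while Landsat 7 is assigned 50 x 365 = 18,250 GB regardless of its production volume.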
AM-1, TRMM, ALT RADAR, ADEOS II, and Landsat 7 archive and production data volumes for each DAAC are based upon the AHWGP work provided by Dave Case 8/22/95; this volume includes the 10.2 GB/day for COLOR which was not included in AHWGP. Estimates do not include ACRIMSAT (impact is considered negligible).
The mission launch dates assumed are as follows:
This data volume is the total of 2x the volume of the V0 Migrated data and 2x the production volume of the EOS Mission data.
ASF distribution numbers include the high resolution ERS 1/2, JERS 1 and RADARSAT products that will be distributed from temporary storage (not archived) and produced ad hoc (ASF technical discussions).
It is assumed that any pull load placed upon the system due to the requirement to support users in the development of new search techniques (F&PRS Requirement SPDS0093) is included in these user pull baseline numbers. Therefore no additional load for system access and data distribution will be assumed for this requirement (i.e. it is within the 2x production volume distribution estimate). User pull on products for the purpose of product Quality Assurance should be assumed to be part of the 2x production volume distribution.
The DAAC profiles use the same sources and assumptions that are detailed in section 1.2; however, the information is segmented by DAAC. The numerical values presented correspond to the totals presented in the User Pull Baseline.
The following categories of information are presented for each DAAC:
The following categories of information are presented and represent the sum of all the individual DAAC values:
Please note that the questions are referenced at the end of each abstract with a unique, permanent question number. Contact Joe Miller (x0802) or Tess Wingo (x0814) for a soft or hard copy of any of the detailed questions and answers.
Frequently in the descriptions of how particular development questions were analyzed and answered, mention is made of science user scenarios and science user demographics. These two tools were developed to aid User Characterization in understanding the science user community and to provide relevant development information to engineers. In the context of the User Characterization activities a science user scenario is a step-by-step description of the actions an ECS user takes in performing research or other work. Each of the scenarios was developed via interviews with actual EOS-funded and other earth scientists. Scientists were asked about their research and how they envision using ECS to accomplish important tasks within their research. Development of the scenarios was a collaborative effort between User Characterization and the scientist. Detailed analysis of the scenarios and assignment of requirements to various functional categories helps in determining the impact a scenario may have on the evolving ECS architecture.
Science user scenarios are classified according to the manner in which both the system and data are accessed by the user. The classification takes the form of a scenario matrix, where the columns of the matrix represent the scale of the user's endeavor, and the rows of the matrix represent the ways in which a user might gain access to the system. For a full treatment of the science user scenario method and process see the white paper, ECS User Characterization Methodology and Results. 194-00313TPW.
Science user demographics were estimated from a variety of sources. Estimates of the overall earth science community were made by examining membership information for 5 major professional societies: American Geophysical Union, IEEE Geoscience and Remote Sensing, American Society of Agronomy, American Meteorological Society, and the Geological Society of America. For each professional society an estimate of the actively publishing proportion of members was derived by counting the number of authors in one year of each society's main publication and dividing this number by the total membership in that society.
In order to obtain demographic estimates for each type of scenario, current investigations appearing in society journals were categorized according to where within the science user matrix they would occur. This was accomplished by doing a rapid survey of articles published in one year of the following 5 professional journals: JGR Atmospheres, JGR Oceans, AGU Water Resources Research, IEEE Geoscience and Remote Sensing, and the International Journal of Remote Sensing. From this rapid literature survey a count of authors was made for each of the scenario matrix cells, from which proportions for each journal examined were derived. Matrix cell counts for all the journals examined were summed to arrive at a total count of active researchers occurring within each matrix cell. From these values estimates of the proportion of researchers from the entire earth science community per matrix cell were derived.
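The counting procedure described above (author counts per matrix cell, proportions per journal, then summed counts and overall proportions) can be sketched as follows. This is a minimal illustration, not the team's actual tooling; the cell labels and counts in the usage note are invented, and the function names are hypothetical.

```python
from collections import Counter


def cell_proportions(counts):
    """Convert author counts per matrix cell into proportions of the total."""
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


def combine_journals(journal_counts):
    """Sum the matrix cell counts over all journals examined."""
    combined = Counter()
    for counts in journal_counts.values():
        combined.update(counts)   # adds counts cell by cell
    return dict(combined)
```

With two hypothetical journals, `{"A": {"r1c1": 3, "r1c2": 1}, "B": {"r1c1": 1, "r2c1": 3}}`, the combined counts are 4, 1, and 3 authors, and `cell_proportions` of the combined counts gives 0.5, 0.125, and 0.375.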
Another resource of User Characterization which is referenced in some of the descriptions below is the EOSDIS Product Use Survey. In April 1995 a World Wide Web-based survey of earth scientists was released with the intention of capturing the earth science community's needs for accessing, browsing, and ordering data products which will become available through EOSDIS during the period 1998-2000.
The categories of information collected from the earth scientists are as follows:
Survey results were imported to a FoxPro database, and are used in support of development question analysis.
Originator of the request: Tom Dopplick (Science Office)
This information was requested in order to achieve a detailed understanding of the types of search requests made by science users of ECS. The information was gathered from the science user scenarios and is in tabular form with search requests sorted by complexity. Columns in the table include: scenario ID number, scenario summary (title), scenario candidate (person), step number (in which the request occurs), user service invoked, and system subservice invoked. (#48 Nov. 23, 1994)
Originator of the request: John Ujhazy (IMS).
The science user scenario spreadsheet was used to perform the analysis. The following categories of information were used from the spreadsheet: layer of the 'data pyramid' which was accessed, type of system subservice, data type, number of hits returned, and maximum demographic estimate for 4 system epochs (early 1997, early 1998, early 1999, mid 1999). Tables for each epoch were created which described the number of hits (applying demographics) returning to the user from the metadata layers of the data pyramid for each type of search, and whether it was a single or multi-site search. Warning: multiple-site queries may not be possible in early 1997; in the other system epochs described, however, this capability will exist. (#51 February 1, 1995)
Originator of the request: John Farley (IMS).
The science scenario spreadsheet was used to estimate and categorize user queries. Twelve separate categories were denoted using 3 different attributes: (1) type of search, (2) one site or multi-site, and (3) upper (metadata) or lower (level data) portion of the data pyramid. The maximum demographic estimate for epoch 4 (mid 1999) was applied to the yearly frequency of the request (derived from the science user scenarios). (#23a November 18, 1994)
Originator of the request: John Farley (IMS).
This question and answer developed as an outgrowth of 2.2.3. The data pyramid layer categories used in this question were: Directory/Guide/Bibliographic References (considered together), Inventory, QA Statistics, Summary Statistics, and Algorithms. The different search (query) types used were: simple, match-up, and coincident searches. Queries could either be multi-site or one site. For each system epoch, each query occurrence in the science user scenarios was multiplied by the average number of times it occurred per year and by the number of people associated with the particular scenario for that time period. (#23b January 9, 1995)
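The multiplication described above can be sketched as a small aggregation over scenario steps. This is an illustrative sketch under stated assumptions: the `QueryStep` record and its field names are hypothetical, and the sample values in the usage note are invented, not taken from the scenario database.

```python
from collections import defaultdict
from typing import NamedTuple


class QueryStep(NamedTuple):
    """One query occurrence in a science user scenario (hypothetical layout)."""
    layer: str             # data pyramid layer, e.g. "Inventory"
    query_type: str        # "simple", "match-up", or "coincident"
    multi_site: bool       # True for multi-site, False for one site
    times_per_year: float  # average yearly frequency of this query
    users: int             # people associated with the scenario in this epoch


def yearly_queries(steps):
    """Total queries per year for each (layer, query type, site scope)."""
    totals = defaultdict(float)
    for s in steps:
        totals[(s.layer, s.query_type, s.multi_site)] += s.times_per_year * s.users
    return dict(totals)
```

Running this once per system epoch, with that epoch's demographic estimate in `users`, yields one table per epoch. For instance, two one-site simple Inventory queries made 12 times a year by 100 users and 4 times a year by 50 users contribute 1,400 queries per year to that category.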
Originator of the request: Hassan Rifky (CIDM)
This information was needed in order to see whether the current data dictionary database will be able to handle these queries. The ECS User Characterization scenario database was used to extract a full profile of types of user queries. Information included text descriptions of the query and the system subservice category of the query. There were a total of 110 query examples. (#81, November 27, 1995)
Originator of the request: Jan Dreisbach (MRS)
This information was desired in order to test the Illustra database software with the types of complex queries scientists will submit to the system in the Release B timeframe. The queries include detailed information about: temporal and spatial coverage, data parameters, and data products. (#83, January 24, 1996)
Originator of the request: Richard Hunter (MRS)
The number of searches per year for all science users was computed with the science user scenarios and demographics. This number was divided by 250 days to obtain a daily total which was distributed over the 24 time zones according to the number of users expected in each time zone. The number of searches in each individual time zone was distributed over a 24-hour period according to the time of day curve, summed (preserving time differences), and referenced to Time Zone 0. Time of day variation in searches was distributed among the DAACs according to Release B RDAFs (Relative DAAC Access Frequencies), resulting in time of day curves for each DAAC time-referenced to Time Zone 0. Each DAAC curve was adjusted to reference the data to local time of day, and the number of searches per 30 minutes was divided by 30 to obtain an average number of searches per minute during each 30-minute period. (#84, February 9, 1996)
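The last three steps above (apportioning by RDAF, re-referencing to DAAC local time, and converting to a per-minute rate) might look like the sketch below. It assumes the Time Zone 0 curve is already available as 48 half-hour counts, that the RDAF is expressed as a fraction between 0 and 1, and that the DAAC's offset from Time Zone 0 is a whole or half number of hours; the function name and parameters are hypothetical.

```python
BINS_PER_DAY = 48          # 30-minute periods in a day
MINUTES_PER_BIN = 30


def daac_searches_per_minute(searches_per_half_hour, rdaf, offset_hours):
    """Per-minute search rates for one DAAC, in that DAAC's local time.

    searches_per_half_hour: 48 counts referenced to Time Zone 0.
    rdaf: relative DAAC access frequency as a fraction (assumed form).
    offset_hours: DAAC local time minus Time Zone 0 time, e.g. -5.
    """
    if len(searches_per_half_hour) != BINS_PER_DAY:
        raise ValueError("expected 48 half-hour values")
    shift = int(round(offset_hours * 2))          # offset in half-hour bins
    local = [0.0] * BINS_PER_DAY
    for i, count in enumerate(searches_per_half_hour):
        # A bin at Time Zone 0 lands 'shift' bins later in local time.
        local[(i + shift) % BINS_PER_DAY] = count * rdaf
    return [c / MINUTES_PER_BIN for c in local]
```

As a check on the arithmetic: 600 searches in the 05:00 Time Zone 0 bin, an RDAF of 0.5, and an offset of -5 hours give 300 searches in the DAAC's local midnight bin, or 10 searches per minute.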
Originator of the request: Sidarth Ambardar (CSMS)
This information was desired in order to clarify the network characteristics which may be required to meet user expectations. In this case, the developer needed to know how quickly a user would want browse images delivered to their system. The science user scenarios were used to derive this information. Each occurrence and length of time desired for browse delivery was recorded. The demographics associated with each scenario (for epoch 4, mid 1999) were also recorded. Browse delivery times (as specified by the science users) were sorted into 4 classes: 0 - 1 minute, 1 - 2 minutes, 2 - 5 minutes, and > 5 minutes. The browse delivery class information was multiplied by the associated demographics to arrive at an estimate for the number of people expecting delivery in each time class. (#4 November 8, 1994)
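The classification and weighting described above can be sketched as follows. The four classes come from the text; how a time exactly on a class boundary (1, 2, or 5 minutes) was assigned is not stated, so placing it in the lower class is an assumption of this sketch, as are the function names.

```python
import bisect

CLASS_UPPER_BOUNDS_MIN = [1, 2, 5]   # upper edges of the first three classes
CLASS_LABELS = ["0-1 min", "1-2 min", "2-5 min", ">5 min"]


def delivery_class(minutes):
    """Return the delivery-time class for a desired browse delivery time.

    Assumption: a time exactly on a boundary falls in the lower class.
    """
    return CLASS_LABELS[bisect.bisect_left(CLASS_UPPER_BOUNDS_MIN, minutes)]


def users_per_class(occurrences):
    """Sum the users expecting delivery in each class.

    occurrences: iterable of (desired_minutes, users) pairs, one per
    browse delivery occurrence in the scenarios.
    """
    totals = {label: 0 for label in CLASS_LABELS}
    for minutes, users in occurrences:
        totals[delivery_class(minutes)] += users
    return totals
```

For example, invented occurrences of (0.5 min, 100 users), (1.5 min, 40 users), and (10 min, 5 users) produce 100, 40, 0, and 5 users in the four classes.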
Originator of the request: Alla Lake (DADS)
The EOSDIS Product Use Survey data was examined to count the number of occurrences of browse selection in different frequency categories (Annually 1-2/yr, Quarterly 3-10/yr, Monthly 11-24/yr, Weekly 25-100/yr, Daily >100/yr, Rarely <1/yr). Proportions of total browse selections were calculated for each frequency category. This information is valid only for Release B, and the following conditions and assumptions apply: (1) browse products will be provided by the instrument teams (producers of the data); (2) the approximate volume of browse products will be 1 MB; (3) 288 survey responses generated a total count of 5694 browse selections. (#19, June 6, 1995)
Originator of the request: Champa Bhushan (Performance Model)
The original need for this data was as input to the Performance Model. The information was derived from the 'User Pull Technical Baseline' (number of science users) and the science user scenarios (detailed stepwise descriptions of how scientists would interact with the system, collected from actual scientists). The demographic information in the technical baseline was arrived at by using current data system usage statistics from the DAACs to project future system usage. Demographic analysis was performed in 1993 to determine the relative number of science users who would use the EOSDIS in a fashion similar to that described in the science user scenarios. Demographics information was applied to the science user scenario service invocation information to estimate the distribution of services over the course of a year. (#49 December 8, 1994)
Originator of the request: DADS, Performance Modelers
Information used in developing a description of service demand came from the science user scenarios, science user demographics, and time-of-day usage data for a host machine resident at Goddard Space Flight Center. Time periods described are: April 1997, April 1998, April 1999, and June 1999. Total service invocations per day for a 250 day work year were calculated, the number of users in each of the 24 time zones was estimated, and the number of services was distributed across the 24 time zones. A normalized local time of day curve was applied to the number of service invocations per day per time zone. The number of service invocations per day per time zone was summed over all time zones referenced to 00 GMT, taking care to preserve time differences. The result is the number of service invocations per 30-minute period for 48 instances of time during a 24-hour period. This result was shifted 5 hours to GSFC time. (#20 December 1, 1994)
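The time-zone aggregation above can be sketched in Python as shown below. This is a sketch under stated assumptions, not the original analysis: time zones are identified by whole-hour offsets from GMT, the normalized local curve is given as 48 fractions that sum to 1, and the 5-hour shift to GSFC time is taken to mean an offset of -5 hours from GMT. All names are hypothetical.

```python
WORK_DAYS_PER_YEAR = 250
BINS_PER_DAY = 48   # 30-minute periods


def invocations_referenced_to_gmt(yearly_total, users_per_zone, local_curve):
    """Service invocations per 30-minute period, referenced to 00 GMT.

    users_per_zone: {whole-hour offset from GMT: number of users}.
    local_curve: 48 fractions (summing to 1) of a day's invocations
                 falling in each local half-hour period.
    """
    if len(local_curve) != BINS_PER_DAY:
        raise ValueError("expected 48 half-hour fractions")
    daily_total = yearly_total / WORK_DAYS_PER_YEAR
    total_users = sum(users_per_zone.values())
    gmt_bins = [0.0] * BINS_PER_DAY
    for offset_hours, users in users_per_zone.items():
        # Share of the daily load attributed to this time zone.
        zone_daily = daily_total * users / total_users
        for local_bin, fraction in enumerate(local_curve):
            # Local time = GMT + offset, so move back by the offset.
            gmt_bin = (local_bin - 2 * offset_hours) % BINS_PER_DAY
            gmt_bins[gmt_bin] += zone_daily * fraction
    return gmt_bins


def shift_to_zone(gmt_bins, offset_hours):
    """Re-reference a GMT curve to a zone at the given whole-hour offset."""
    return [gmt_bins[(i - 2 * offset_hours) % BINS_PER_DAY]
            for i in range(BINS_PER_DAY)]
```

Calling `shift_to_zone(curve, -5)` re-expresses the summed curve in a zone five hours behind GMT, which is how this sketch interprets the shift to GSFC time.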
Originator of the request: Ron Williamson (SDPS)
The request was for a representative set of search, subset, and subsample queries that EOSDIS scientists would request. The purpose behind this question was to aid in determining the specific functionality supported by the Earth Science Query Language proposed for use in Release B. As this was simply an initial exploratory task, only a sample of between 20 and 30 queries was desired. One query was selected from each of the 27 science user scenarios, and the 27 examples were provided in an Excel spreadsheet. Care was taken to have at least one example of each type of query which occurs in the scenarios. (#79, December 15, 1995)
Originator of the request: Mark Huber (DADS)
Original contractual requirements called for archive and distribution of subintervals of Enhanced Thematic Mapper (ETM) data. Requests were made by the customer that we archive subintervals and distribute scenes of ETM data. In order to provide resources with which to gauge the impact of this change the WWW EOSDIS Product Use Survey data was examined to determine the proportion of people doing research at different spatial scales who also indicated that they would want to order ETM data. It is unlikely that scientists researching at a spatial scale smaller than that of a subinterval will want a complete subinterval of data. Therefore by comparing proportions of people performing research at different spatial scales and comparing those scales to that of a subinterval, one can gain a better understanding as to what proportion of people will be likely to want scenes versus subintervals. (#55 June 5, 1995)
Originator of the request: Mary Armstrong (Multi-Release support)
The concern this information helps to address is the management of network traffic by ECS and user expectations about electronic delivery of data. Would users of ECS be willing to wait until off-peak network hours to receive their data electronically, or wait until off-peak network hours before attempting anonymous ftp of data from a DAAC? Since there are no direct sources of quantitative information, current data center experience was collected and collated. (#54a June 9, 1995)
Originator of the request: Andy Endal (Science Office)
This information was requested in order to help determine whether software should be written to handle large data deliveries; the system will need to break orders greater than 45 GB into smaller components. The information was provided via the ECS science user scenario spreadsheet. The volumes of all the products ordered in a single step within the scenarios were totalled to arrive at a volume for that particular order. If that volume was greater than 5 GB or 45 GB, the number of users associated with that scenario was multiplied by the number of deliveries per year for that order. (#63 October 20, 1995)
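The large-order count described above can be sketched as a threshold filter over scenario steps. The 5 GB and 45 GB thresholds come from the text; the tuple layout, the function name, and the sample orders in the usage note are hypothetical.

```python
def large_deliveries_per_year(orders, threshold_gb):
    """Yearly deliveries of orders whose total volume exceeds a threshold.

    orders: iterable of (product_volumes_gb, users, deliveries_per_year),
            one entry per ordering step in the scenarios.
    """
    total = 0
    for product_volumes_gb, users, deliveries_per_year in orders:
        order_volume = sum(product_volumes_gb)   # all products in one step
        if order_volume > threshold_gb:
            total += users * deliveries_per_year
    return total
```

Evaluating the same list of orders once with `threshold_gb=5` and once with `threshold_gb=45` gives the two counts the abstract refers to.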
Originator of the request: Stuart Dogger (SCDO)
The ECS science user scenario database was used to calculate total orders handled by the system during the Release A and B timeframes. The information was desired to aid in the development of the Operations Concept End to End Order Tracking Scenario (in DID 305), and in preparation of resource planning estimates at the DAACs. (#68 October 17, 1995)
(Number of orders per DAAC)
Originator of the request: Jorge DeDios (Management Subsystem MSS)
This information was desired by DeDios in order to understand how many transactions might need to have a bill or account processed. The 27 science user scenarios and the number of science users associated with each were analyzed for overall ordering activity. The total number of orders was then distributed across the DAACs according to the results of the EOSDIS Product Use Survey. This was done for the following timeframes: January 1997, January 1998, June 1998, January 1999, January 2000. (#76 October 27, 1995)
This question is also used as an update to the question from Stuart Dogger (SCDO) above.
Originator of the request: Tom Smith (DAS)
This information was desired in order to address issues about data ingestion and distribution which arose at the Release B workshop and CDR-A: devices to be supported, associated media form factors, modes of ingestion and distribution, and operational requirements to oversee media ingest and distribution. Some of the information provided (media and volume distributed and requested) was derived from the Statistics Collection and Reporting System database. Information about current data compression practices at the DAACs was provided with the help of the DAAC Science Liaisons. (#74 October 16, 1995)
Originator of the request: Nick Singer (Performance Modelling)
This information was desired in order to size archive and distribution hardware, including robotics, disk, and networks. The science user scenario database was used to add up volume sent to users via subscription (standing orders), and volume sent to users for non-standing orders (ad hoc requests). (#71 August 31, 1995)
Originator of the request: Tom Smith (DSS)
The request was for information on those products which routinely require segmentation, the media upon which these products are normally requested, and the frequency with which segmentation takes place per DAAC. The Statistics Collection and Reporting System (SCRS), which contains current DAAC site statistics, was used to answer this question. The number of media was calculated by dividing the total volume delivered by the capacity of each medium. Capacities were obtained from Jim Closs (GSFC), and ten percent was subtracted from each category to account for tape overhead. (#78, December 17, 1995)
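The media-count calculation above amounts to a division by a derated capacity. The sketch below assumes that the ten percent overhead is subtracted from the nominal capacity of each medium and that a partially filled medium counts as a whole one; the rounding rule, the function name, and the example capacity are assumptions, not values from SCRS or from Jim Closs.

```python
import math

TAPE_OVERHEAD_FRACTION = 0.10   # ten percent subtracted for tape overhead


def media_count(total_volume_gb, nominal_capacity_gb,
                overhead_fraction=TAPE_OVERHEAD_FRACTION):
    """Number of media needed to deliver a volume on one medium type.

    Assumption: the result is rounded up to whole media.
    """
    usable_capacity_gb = nominal_capacity_gb * (1.0 - overhead_fraction)
    return math.ceil(total_volume_gb / usable_capacity_gb)
```

For a hypothetical 5 GB medium, the usable capacity is 4.5 GB, so a 10 GB delivery would need 3 media rather than 2.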
Originator of the request: Roland Weiss (DSS)
Information from the ECS Product Use Survey, the ECS Technical Baseline, and the ECS User Characterization science user scenarios was used to answer this question. The equation Delvtg = (RDAF)(Propgv)(Delvy) was used where RDAF = the relative DAAC access frequency (by end users) as derived from the ECS Product Use Survey, Propgv = the proportion of products at a given DAAC which fall within a particular granule volume class, and Delvy = the number of deliveries per year of the particular granule volume class. Delvtg = the total number of deliveries per year of a particular granule volume class from a particular DAAC. The values for Delvtg were calculated for each step occurring in the science user scenarios and were categorized by the volume which was pulled from the archive to fulfill those requests. (#77, January 5, 1996)
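The equation Delvtg = (RDAF)(Propgv)(Delvy) can be evaluated for every DAAC and granule volume class with a short function. The sketch below assumes RDAF and Propgv are expressed as fractions; the dictionary layout, the function name, and the DAAC and class labels in the usage note are placeholders.

```python
def deliveries_by_daac_and_class(rdaf_by_daac, propgv, delvy_by_class):
    """Compute Delvtg = RDAF * Propgv * Delvy for each (DAAC, class) pair.

    rdaf_by_daac:   {daac: relative DAAC access frequency (fraction)}
    propgv:         {(daac, volume_class): proportion of that DAAC's
                     products in the granule volume class}
    delvy_by_class: {volume_class: deliveries per year of that class}
    """
    return {
        (daac, volume_class): rdaf_by_daac[daac] * prop * delvy_by_class[volume_class]
        for (daac, volume_class), prop in propgv.items()
    }
```

With made-up inputs, an RDAF of 0.25 for "DAAC_A", a proportion of 0.5 for its "small" class, and 1,000 deliveries per year of that class give Delvtg = 125 deliveries per year.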
Originator of the request: Chin-Ho Lien (M & O) Release B
This information was needed for use in the operations workshop on January 17-19, 1996. The science user scenarios were used for estimating the values for electronic and media distribution. For each scenario step that involved distribution of level data, the number of deliveries associated with that step for the January 2000 timeframe was recorded. Based upon the mode of delivery the requests were sorted into electronic or media categories. (#80, January 3, 1996)
Originator of the request: Jim Closs (USWG)
The science user scenarios were used for estimating the required number of data requests per year through electronic and physical media. For each scenario step that involved distribution of Level data, the number of deliveries that went to a single user was multiplied by the maximum number of users associated with that step for the Release A time frame (viz. Jan 1998; this is the maximum demographic estimate for Release A and includes standing order distributions). Based on the medium of data delivery, the requests were split into electronic and media requests.
Originator of the request: Mary Armstrong (MRS)
The ECS Science User Scenario database was used to answer this question. For each step in the scenario database where granules were being distributed to a user, the number of granules was multiplied by the number of deliveries and the number of users (January 2000 demographics estimate for that step). In cases where non-EOS data was being delivered to users, the number of deliveries was multiplied by the estimated number of users (January 2000 timeframe). These figures were then divided by the number of days in a work year (250 days = 5 day work week with a 2 week vacation). (#89b, March 4, 1996)
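The per-day granule estimate above reduces to a product summed over scenario steps and divided by the 250-day work year (5 days x 50 weeks). A minimal sketch, with a hypothetical tuple layout and invented sample values:

```python
WORK_DAYS_PER_YEAR = 5 * (52 - 2)   # 5-day week, 2 weeks of vacation = 250


def granules_per_work_day(steps):
    """Average granules distributed per work day.

    steps: iterable of (granules_per_delivery, deliveries_per_year, users),
           one entry per scenario step that distributes granules.
    """
    yearly = sum(granules * deliveries * users
                 for granules, deliveries, users in steps)
    return yearly / WORK_DAYS_PER_YEAR
```

Two made-up steps, (10 granules, 4 deliveries, 25 users) and (2 granules, 5 deliveries, 50 users), total 1,500 granules per year, or 6 granules per work day.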
Originator of the request: Mary Armstrong (End to End Modelling, MRS)
The User Characterization science user scenarios and their associated demographics were used to answer this question. For scenario steps in which the user requested an order ad hoc, requested the data "as available", or requested a subscription for data on a periodic basis, the following information was collected: number of science users associated with that step, number of deliveries per year, and the number of granules per delivery. The information was presented in three different ways in order to facilitate estimation under different sets of assumptions. (#97, 88, April 2, 1996)
Originator of the request: Denise Heller (Data Modelling), current contact: Rosemary Dakos
The original need for this information was to aid in data model development, more specifically the control (or dispersion) of demand placed upon data objects. The WWW EOSDIS Product Use Survey information was used to estimate relative demand for different levels of data at each of the DAACs. Since the survey only asked questions about data products and not about demand for metadata we could only provide estimates for level data, not the entire data pyramid. The information is provided in table form by product and sorted by DAAC at which the products will be stored. (#47 June 5, 1995)
Originator of the request: Mary Armstrong (Multi-Release Support)
The ECS Product Use survey (administered via the World Wide Web) was used to estimate relative demand for products from science end-users. Information from the survey regarding the frequency with which survey-takers said they would order specified data products was used to compare the demand across products and the DAACs which will contain those products. These results are presented in tabular format with: product ID, product name, product level, DAAC at which it's resident, instrument, relative percentage of demand for the product, and relative product demand totals by DAAC. (#54 March 23, 1995)
Originator of the request: Sonlihn Phuvan (DAS)
This information was needed to contribute to a data compression study being done by DAS for the purposes of network and data delivery planning. Because different datatypes have different compression ratios it was necessary to have an estimate for the relative demand for these different types. The following resources were used to estimate the relative demand for different data types (Grid, Point, Swath, Table, Native):
The above values were used to calculate the proportionate demand by volume for each of the datatypes. The product/datatype mapping was provided from a draft spreadsheet developed by Graham Read. (#65 October 13, 1995)
Originator of the request: Graham Vowles (SCDO)
This information was desired in order to size performance characteristics for the WWW COTS server. The ECS Science User Scenario Database was examined to answer the question. The number of times these data layers were accessed, the number of science users associated with each of those steps, and the number of deliveries per year for that data layer for that particular step were recorded. Deliveries per year were multiplied by associated demographics to arrive at a total estimate for accesses per year. This value was distributed among the Release A DAACs using the Relative DAAC Access Frequency proportions derived from the ECS Product Use Survey. (#66, November 1, 1995)
Originator of the request: Mary Armstrong (End to End Modelling, MRS)
The User Characterization science user scenarios and their associated demographics were used to answer this question. All requests and searches for documents in the scenarios were multiplied by the number of users associated with that step. The steps were totalled to arrive at an estimate for the number of requests for documents per year, and divided by 250 for a per day estimate for a 250 day work year, and 365 for a per day estimate for a 365 day year. (#103, March 20, 1996)
Originator of the request: Sidarth Ambardar (Multi-Release Support - H/W & Networks Eng.)
This type of information was desired to aid in network planning between DAACs. The information was generated using three basic sources: science user scenarios, number of science users, and the ECS Product Use survey. Service invocations which will cause interDAAC traffic to occur were counted from the science user scenarios. The number of science users was taken from the ECS User Pull Technical Baseline (6/95). Proportions of this population which would use the system in a manner similar to the scenarios were based upon a 1993 demographic study. The ECS Product Use survey results were used to derive relative DAAC access frequencies by totaling the relative product access frequencies from the survey for all the products resident at each DAAC. The probability of entering the system at each DAAC was multiplied by the probabilities of accessing other DAACs from the system access DAAC. This value was then multiplied by the volume of user queries going out to the DAACs and the volume of results coming back from the DAACs. (#8 & 9 July 14, 1995)
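The probability product described above can be sketched as a traffic matrix between DAAC pairs. This is a simplified illustration: it assumes the entry and access probabilities are supplied as fractions per DAAC, applies one combined annual volume (queries out plus results back) to every pair, and ignores traffic that stays within the entry DAAC. The function name and the DAAC labels in the usage note are placeholders.

```python
def inter_daac_traffic(entry_prob, access_prob, annual_volume):
    """Estimated annual traffic volume between each ordered DAAC pair.

    entry_prob:    {daac: probability that a user enters the system there}
    access_prob:   {daac: probability that the DAAC's holdings are accessed}
    annual_volume: combined query and result volume per year (assumed to be
                   a single figure applied to every pair)
    """
    traffic = {}
    for entry, p_entry in entry_prob.items():
        for target, p_access in access_prob.items():
            if target == entry:
                continue   # same DAAC: no interDAAC traffic
            traffic[(entry, target)] = p_entry * p_access * annual_volume
    return traffic
```

With hypothetical probabilities of 0.5, 0.25, and 0.25 for DAACs "A", "B", and "C" and an annual volume of 1,600 units, the A-to-B entry is 0.5 x 0.25 x 1,600 = 200 units.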
Originator of the request: Mary Armstrong, Sidarth Ambardar (Multi-Release Support, CSS)
The purpose of gathering this information was for use in external network sizing for pull users of EOSDIS. The annual volume of InterDAAC traffic was computed from the science user scenarios with their associated demographics and the relative product access frequency data derived from the ECS Product Use Survey. The average daily volume of InterDAAC traffic was calculated by dividing the total annual estimated volume by 250 days (one work year). The existing service invocation rate information (section 2.4.2) was applied to the volume of InterDAAC traffic for Early 1999 and Mid 1999 (these time periods are expected to have the highest number of users of the 4 system epochs). (#53 Aug 23, 1995)
Originator of the request: Janet Hylton (MRS -- Data Modelling)
The science user scenarios were used to count the subsystem service accesses. For each step in the scenarios, if the results were sent to the user, the total number of deliveries for the step request was multiplied by the associated demographics for the required time frame. This was performed for all relevant scenario steps, and the results were totalled for each subsystem service pyramid level combination which occurred within the scenarios. The results were provided for both Release A and B timeframes. (#62, October 20, 1995)
Originator of the request: Mary Armstrong (End to End Modelling, MRS)
The User Characterization science user scenarios were used to obtain a total number of service invocations per year for each of the 65 system subservices. Of the 65 subservices, 28 are currently mapped to the Data Server Subsystem (DSS). The yearly number of invocations for each of these 28 subservices was totalled, then divided by 250 days. The MRS developers used this number and made the assumption that 5% of the DSS sessions will be suspended per day. (#92, March 11, 1996)
Originator of the request: George Mellis (MRS)
The User Characterization science user scenarios and their associated demographics were used to answer this question. The subservices which are not solely mapped to the client subsystem (34) were totalled and divided by 250 days (1 work year) to provide a daily rate. The yearly total was also divided by 365 days since many users will work on weekends and holidays. The daily averages were then distributed among the 6 DAACs according to the relative interest in the products archived at each DAAC (see doc# 161-TP-001-001). Note: the mapping of subservices to subsystems was performed by Randall Miller in the MRS segment. (#91, March 5, 1996)
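The distribution step can be illustrated as a proportional split. The sketch below assumes the relative interest is available as one weight per DAAC; the weights, DAAC labels, and annual total are hypothetical and do not come from doc# 161-TP-001-001.

```python
def distribute_by_interest(daily_rate, relative_interest):
    """Split a system-wide daily rate among DAACs in proportion to interest."""
    total_weight = sum(relative_interest.values())
    return {daac: daily_rate * weight / total_weight
            for daac, weight in relative_interest.items()}


# Hypothetical annual invocation total and interest weights for 6 DAACs.
annual_invocations = 1_825_000
work_day_rate = annual_invocations / 250      # 250-day work year
calendar_day_rate = annual_invocations / 365  # includes weekends, holidays

interest = {"DAAC_1": 4, "DAAC_2": 3, "DAAC_3": 1,
            "DAAC_4": 1, "DAAC_5": 0.5, "DAAC_6": 0.5}

work_day_by_daac = distribute_by_interest(work_day_rate, interest)
calendar_day_by_daac = distribute_by_interest(calendar_day_rate, interest)
```

Because the split is proportional, the per-DAAC values always add back up to the system-wide daily rate.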
Originator of the request: Mary Armstrong (End to End Modelling, MRS)
The User Characterization science user scenarios and their associated January 2000 demographics were used to answer this question. Each instance in the scenarios when an ingest of data or metadata occurred was recorded and the number of deliveries and number of users were multiplied. All of these occurrences were totalled to arrive at the total number of metadata ingests per year. (#98, March 20, 1996)
Originator of the request: Tom Hickey (M&O), Richard Hunter (MRS)
This information is based in part upon the User Pull Technical Baseline 2/5/96. The logon proportion for each DAAC was computed from the baseline information. The total number of accesses was reduced by 38% to account for users who today must access each DAAC separately, but who, in the future, will be able to access one DAAC from another during the same session. The number of accesses was divided by 250 to obtain the number per day, and these were distributed across the time zones according to the global distribution of scientists. The time of day usage curve (see document 160-WK-001.001, ECS User Model Inputs to System Performance Model: Methodology and Results) was applied to each individual time zone, and the 24 zones were summed and referenced to time zone 0. The resulting time of day usage curve was then distributed amongst the DAACs according to the logon proportions as calculated. (#60, February 21, 1996)
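The time zone summation can be sketched as follows. The 38% reduction and the 250-day divisor come from the abstract; everything else (the function names, the two-zone population split, the flat working-hours curve, and the DAAC proportions) is a hypothetical stand-in for the data in document 160-WK-001.001 and the baseline.

```python
def hourly_load_zone0(annual_accesses, zone_share, usage_curve,
                      reduction=0.38, work_days=250):
    """Return 24 hourly access counts referenced to time zone 0.

    zone_share:  {offset in hours from zone 0: fraction of users}
    usage_curve: 24 fractions of a day's accesses, indexed by local hour
    """
    daily = annual_accesses * (1 - reduction) / work_days
    load = [0.0] * 24
    for offset, share in zone_share.items():
        for local_hour, fraction in enumerate(usage_curve):
            zone0_hour = (local_hour - offset) % 24  # local = zone0 + offset
            load[zone0_hour] += daily * share * fraction
    return load


def split_by_daac(load, logon_proportion):
    """Distribute the summed curve among DAACs by logon proportion."""
    return {daac: [hour * p for hour in load]
            for daac, p in logon_proportion.items()}


# Hypothetical inputs: flat usage from 09:00 to 17:00 local time.
curve = [0.125 if 9 <= h < 17 else 0.0 for h in range(24)]
load = hourly_load_zone0(250_000, {0: 0.5, -5: 0.5}, curve)
per_daac = split_by_daac(load, {"DAAC_A": 0.75, "DAAC_B": 0.25})
```

In this toy case the two populations overlap between 14:00 and 17:00 in zone 0, which is where the summed curve peaks. A wider spread of time zones flattens the peak in the same way.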
Originator of the Request: Mary Armstrong (MRS -- End to End Modelling)
The statistics gathered by the Global Change Master Directory (GCMD) were used as a means of describing the present day accesses to a service similar to what the Advertising Service is envisioned to be. These statistics cover HTML requests, malformed requests, and total requests. Average daily rates were derived from these monthly totals, and are presented in tabular and graphical form. (#100b, March 27, 1996)
Originator of the Request: Mary Armstrong (MRS -- End to End Modelling)
Since the Global Change Master Directory (GCMD) provides information and services similar to what is envisioned for the ECS Advertising Service, the statistics gathered by the GCMD (July 1994 to February 1996) were used to answer this question. An average daily rate was derived from the statistics (which had a monthly resolution). Results are presented in tabular and graphical form. (#94, #95, #96, March 27, 1996)
Originators of the request: Huber and Lake (DAS)
This information was desired by developers to aid in data server design. The information had implications for data server "workspace", relative demand for subsetting, and any potential bottlenecks in the order fulfillment process. For each instance in the science user scenarios where data subsetting was required, the volume which was operated upon was recorded. Assumptions were made about the sequence in which subsetting was performed: (1) parametric, (2) spatial, (3) gridding, (4) temporal, (5) scaling for browse data. The volume operated upon at each stage came from the prior subsetting operation. The number of times a request was delivered per year was multiplied by the number of users estimated for that scenario. Requests were sorted by the volume of the request and the number of granules for each request. (#16 and #18, December 15, 1994)
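To show how the assumed sequence determines the volume each operation works on, here is a small sketch. The ordering comes from the abstract; the function name, the idea of a "retained fraction" per operation, and the example numbers are hypothetical.

```python
# Assumed subsetting order, as listed in the abstract.
SUBSET_ORDER = ["parametric", "spatial", "gridding", "temporal",
                "browse_scaling"]


def volumes_operated_on(initial_volume, retained_fraction):
    """Walk the assumed order and record the input volume of each operation.

    retained_fraction maps an operation name to output volume / input volume.
    Operations absent from the mapping are skipped for this request.
    Returns (list of (operation, input volume), final output volume).
    """
    volume = initial_volume
    operated = []
    for operation in SUBSET_ORDER:
        if operation not in retained_fraction:
            continue
        operated.append((operation, volume))
        volume *= retained_fraction[operation]
    return operated, volume


# Hypothetical request: 1000 MB, subset by parameter, region, then time.
example_ops, example_final = volumes_operated_on(
    1000.0, {"parametric": 0.5, "spatial": 0.25, "temporal": 0.5})
```

Here the parametric step operates on 1000 MB, the spatial step on 500 MB, and the temporal step on 125 MB, leaving 62.5 MB to deliver. Placing the most selective operations first keeps the intermediate "workspace" small.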
Originators of the request: Huber and Lake (DAS)
The purpose of the information was to describe the volume and number of granules accessed per request in order to plan access, storage, distribution techniques and media types. The science user scenarios were used to calculate the granules per request, data volume per request, and the number of granules coming from the production stream (standing orders). Values were calculated by multiplying by the number of times the data was delivered per year, and the number of users associated with the scenario step. Values for steps are shown individually, with demographics, in order to allow the DADS team to perform trade studies for media and distribution. (#13, November 18, 1994)
Originator of the request: Dean Moore (CSMS)
The purpose for collecting this information was to assist CSMS in making estimates for the required bandwidth for connections outside of each DAAC's LAN. The science user scenario 'results volume per year', 'number of deliveries per year', and 'maximum demographic estimates' for the 4 epochs (early 1997, early 1998, early 1999, and mid 1999) were used to answer this question. The number of deliveries per year was multiplied by the maximum demographic estimate for each epoch to arrive at total deliveries per year. Results volume per year was multiplied by the maximum demographic estimate for each epoch to arrive at total volume delivered per year. (#50, February 2, 1995)
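The two multiplications can be sketched per scenario step as shown below. The epoch names are those given in the abstract; the function name, the units, and all numeric values are hypothetical.

```python
def epoch_totals(deliveries_per_year, results_volume_per_year,
                 max_users_by_epoch):
    """Scale one scenario step's per-user figures by each epoch's users.

    Returns {epoch: {"deliveries": total per year, "volume": total per year}}.
    """
    return {
        epoch: {
            "deliveries": deliveries_per_year * users,
            "volume": results_volume_per_year * users,
        }
        for epoch, users in max_users_by_epoch.items()
    }


# Hypothetical step: 12 deliveries and 3.0 GB of results per user per year.
max_users = {"early 1997": 100, "early 1998": 150,
             "early 1999": 400, "mid 1999": 450}
step_totals = epoch_totals(12, 3.0, max_users)
```

Summing these per-step totals over all scenario steps would give the annual figures for each epoch.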
Joseph Miller, ECS Science Office
x0802 / jmiller@eos.hitc.com
Revised: May 6, 1996
URL: http://newsroom.gsfc.nasa.gov/user_char/prot/UCTIC.html