ITS 122 Blog

Wednesday, December 9, 2009

assign_02(Current Trends)

December 8, 2009
Sybase ASE 15: Very Large Storage Systems
by Jeffrey Garbus

Executive Summary:
In this past year, the author has had multiple customers accelerate their migrations to ASE 15 with one goal in mind: taking advantage of Very Large Storage Structures (VLSS) in ASE 15 so that they can increase the size of their growing databases to something with an almost unlimited upside. The idea behind VLSS is that Sybase has modified the ASE system tables to accommodate the larger limitations. In this article old structure, new structure, and practical limitations are discussed.
Introduction
From its inception, Adaptive Server Enterprise (ASE) has been a high-volume Online Transaction Processing (OLTP) or Decision Support System (DSS) or mixed-use database, capable of storing, managing, and retrieving an enormous amount of data.

Over time, the definition of the word “enormous” has changed.

In the early 1990s, we designed and rolled out an application, which contained about 65 gigabytes of data. At the time, our contemporaries wondered if the system would even accommodate a table with 2.5 billon rows of data in a single table (it did). It was, at the time, the 4th largest ASE installation in the world. Anything over 20 gig was considered a Very Large Database (VLDB). Within a few months it had doubled in size, and a few months later doubled again to 250 gig, a monster in those days. The entire project was cost-justified in storage savings: a 2-gigabyte DASD device (for their DB2 applications) cost approximately $40K, while a 2 gigabyte SCSI drive cost about $2K.

Today, you can buy an external 1Terabyte hard drive for under $100. Times change.

In this past year, I have had multiple customers accelerate their migrations to ASE 15 with one goal in mind: taking advantage of Very Large Storage Structures (VLSS) in ASE 15 so that they can increase the size of their growing, 4 terabyte databases to something with an almost unlimited upside. (Yes, famous last words, but with a potential upside of 2 billion devices of 4 terabytes each, I’ll risk eating those words).

The idea behind VLSS is that Sybase has modified the ASE system tables to accommodate the larger limitations.

In this article, we’re going to discuss old structure, new structure, and practical limitations.
Pre-ASE 15 Storage

Prior to ASE 15.0, ASE was limited to 256 logical devices and 32GB for individual devices. Note that when these limits were originally put in place, 20 gig was thought to be a Very Large Database (VLDB), and it was years after before anyone attempted one. In addition, because of an intra-database addressing limitation, an individual database was limited to 4T (if you have a 2K page size) and 8T (if your page size is larger). These limitations were irrespective of physical device or OS-level limitations.

This was driven by the fact that ASE’s storage was organized by virtual pages using a 4 byte bit mask. The virtual page was the mechanism the server used to connect the logical concept of a page with the physical disk. The high order byte of the vdevno column in the sysdevices table was used to encode the device id, and consequently the byte of 8 bits limited ASE to 256 devices (8 bits as a signed integer). (Of course, with one device set aside for master, and perhaps others set aside for tempdb, sysaudits, etc., there were sometimes fewer than 256 available for data). The remaining 3 bytes were used to track the actual virtual page numbers – which considering a 2K page (the storage factor in sysdevices) provided a limit of 32GB per device. Note that theoretically, a 16K server could have large devices, but due the default of 2K and the implementation of sysdevices, a limit of 32GB was imposed.

---------------------------------------------------------------------

December 4, 2009

MS SQL

SQL Azure with ASP Dot Net

By Don Schlichting

Introduction

SQL Azure (SSDS - SQL Server Data Services) is a cloud database system offered by Microsoft. We interact with the SQL Azure services by either issuing statements to it though a command prompt or developing Dot Net applications. This article will introduce and demonstrate development of SQL Azure ASP Dot Net applications.

Connecting

Connecting to SQL Azure is similar to connecting to a traditional SQL Server. A connection string is created then opened. To begin with SQL Azure, create an account from the Microsoft Azure web site. Once the account is created, log in and navigate to the Server Administration page. Select the master database (master is automatically created for us when the Azure account is created), and then select “Connection Strings”. A popup will display the connection string for ADO Dot Net and ODBC. Copy the ADO Dot Net string. On the next tab, Firewall Settings, create a rule to allow your client machine access as shown below.

The IP address will be your outside connection address. The firewall tool will display the IP address you connected from. SQL Azure services are firewall protected, so each client requiring access will need a rule. If you're developing from a machine which receives a DHCP address from an internet service provider, then a new rule will need to be created each time a new address is received (unless you know the IP address range, then it can be specified here). Allow up to five minutes for the new firewall rule to go into effect.

Create a new Dot Net web site and aspx page from Visual Studio. In the Page Load section, add using statement s for System.Data and System.Data.SQLClient just as we would for a traditional SQL Server web page. Save the connection string as a variable and pass it to the SqlConnection object as shown below:

    protected void Page_Load(object sender, EventArgs e)
  {

      string sConn = "Server=tcp:qdpapn8.database.windows.net;
        Database=master;
        User ID=dons;
        Password=mypassword;Trusted_Connection=False";

      SqlConnection objConn = new SqlConnection(sConn);
      objConn.Open();
    
      objConn.Close();

  }

The connection string ID and password were specified when the Azure account was first created. The server section specifies the fully qualified name. In this example, qdpapn8, is my personal unique server that SQL Azure auto generated for my account. My personal server name is appended to the public “database.windows.net” domain to create the Fully Qualified Name (FQN). Note the “tcp” prefix to the server name. SQL Azure listens on TCP port 1433, so our client workstation must allow outbound traffic on this port. The Trusted Connection will always be set to False, as Windows Logins are not supported on SQL Azure. Executing this web page should result in a blank Internet Explorer window. If an error is thrown, it will be displayed there instead.

CREATE DATABASE

Now that we have a valid connection string, we’ll create a test database. The TSQL syntax for SQL Azure is usually identical to traditional SQL Server. A full list of supported statements, limitations, and options is available from this Microsoft web site: Transact SQL Reference SQL Azure Database. In this example, we’ll pass a CREATE DATABASE statement to a SqlCommand object as shown below. Remember to add the System.Data and System.Data.SQLClient using statements to all examples in this article.

    protected void Page_Load(object sender, EventArgs e)
  {

      string sConn = "Server=tcp:qdpapn8.database.windows.net;Database=master;User ID=dons;
  Password=mypwd;Trusted_Connection=False";

      SqlConnection objConn = new SqlConnection(sConn);
      objConn.Open();

      string sSQL = "CREATE DATABASE TestDB";

      SqlCommand objCmd = new SqlCommand(sSQL, objConn);
      objCmd.CommandType = System.Data.CommandType.Text;
      objCmd.ExecuteNonQuery();

      objConn.Close();
}

The CREATE DATABASE statement was saved as a string and then passed as a Text Command statement. After executing this page, a new 1 GB database was created as shown in the Server Administration web page.

Currently, there are two different database sizes, 1 GB or 10 GB, each with a different pricing policy.

CREATE TABLE

To create a Table in the new database, change the connection string specifying the new Database Name and change the statement to a typical CREATE TABLE as shown below. This example will create a two-column table.

   protected void Page_Load(object sender, EventArgs e)
  {
      string sConn = "Server=tcp:qdpapn8.database.windows.net;Database=TestDB;User ID=dons;
  Password=mypwd;Trusted_Connection=False";

      SqlConnection objConn = new SqlConnection(sConn);
      objConn.Open();

      string sSQL = "CREATE TABLE myTestTable
  ( FirstName varchar(50) NOT NULL, LastName varchar(50) NULL,
  CONSTRAINT PK_TestTable PRIMARY KEY (FirstName) )";

      SqlCommand objCmd = new SqlCommand(sSQL, objConn);
      objCmd.CommandType = System.Data.CommandType.Text;
      objCmd.ExecuteNonQuery();

      objConn.Close();
  }

Note that the CREATE TABLE statement included a key, in this case a PRIMARY KEY. Heap tables (tables without an index) are not allowed in SQL Azure.

INSERT

The syntax for inserting data into SQL Azure is identical to traditional SQL Server. Here we’ve modified the SQL string with an INSERT. The rest of the code, SQLConnection, SQLCommand, etc., are identical to the previous examples.

string sSQL = "INSERT INTO myTestTable (FirstName, LastName) VALUES ('bob', 'smith') ";

SELECT

To receive the newly inserted row back, drag a GridView control onto a new page and modify the SQL string to a SELECT statement. Then we’ll bind the GridView to the SQLCommand object and execute the Reader as shown below:

   protected void Page_Load(object sender, EventArgs e)
  {
      string sConn = "Server=tcp:qdpapn8.database.windows.net;Database=TestDB;User ID=dons;
  Password=mypwd;Trusted_Connection=False";

      SqlConnection objConn = new SqlConnection(sConn);
      objConn.Open();

      string sSQL = "SELECT * FROM myTestTable";

      SqlCommand objCmd = new SqlCommand(sSQL, objConn);
      objCmd.CommandType = System.Data.CommandType.Text;

      myGridView.DataSource = objCmd.ExecuteReader();
      myGridView.DataBind();

      objConn.Close();
  }

The result will be our one inserted row being returned as shown below:

Stored Procedures

In my opinion, one of the best features of SQL Azure is the ability to create Stored Procedures in the cloud. This is a rare feature, most cloud databases do not support stored procedures and instead database code must be passed as a string. By using Stored Procedures, we create a clear separation between database code and application code. In addition, it’s easy to share the same stored procedure between many web pages and at the same time we avoid SQL Injection attacks. The syntax for creating a Stored Procedure in the cloud is the same as traditional SQL Server. In this example, we’ll simply modify the SQL string as shown below:

    protected void Page_Load(object sender, EventArgs e)
  {

      string sConn = "Server=tcp:qdpapn8.database.windows.net;Database=TestDB;User ID=dons;
  Password=mypassword;Trusted_Connection=False";

      SqlConnection objConn = new SqlConnection(sConn);
      objConn.Open();

      string sSQL = "CREATE PROCEDURE GetNames AS SELECT * FROM myTestTable";

      SqlCommand objCmd = new SqlCommand(sSQL, objConn);
      objCmd.CommandType = System.Data.CommandType.Text;

      objCmd.ExecuteNonQuery();
      objConn.Close();

  }

To execute the Stored Procedure, we’ll drag a GridView onto the form; bind that grid to the SqlCommand object, and then call the ExecuteReader as shown below.

protected void Page_Load(object sender, EventArgs e)
  {
      string sConn = "Server=tcp:qdpapn8.database.windows.net;Database=TestDB;User ID=dons;
  Password=mypassword;Trusted_Connection=False";

      SqlConnection objConn = new SqlConnection(sConn);
      objConn.Open();

      SqlCommand objCmd = new SqlCommand("dbo.GetNames", objConn);
      objCmd.CommandType = System.Data.CommandType.StoredProcedure;

      myGridView.DataSource = objCmd.ExecuteReader();
      myGridView.DataBind();

      objConn.Close();
  }

Conclusion

Working with SQL Azure (SQL Server Data Services SSDS) leverages our existing TSQL and Dot Net skills. The TSQL for the cloud usually has the same syntax as a traditional SQL Server, and calling the cloud from a web page uses the same commands and readers as other familiar data sources. Having the ability to create relationships, use TSQL, and execute stored procedures in the cloud are strong benefits of SQL Azure.

Thursday, November 19, 2009

assign_01

Hierarchical data model is a data model in which the data is organized into a tree-like structure. The structure allows repeating information using parent/child relationships: each parent can have many children but each child only has one parent. All attributes of a specific record are listed under an entity type.

Example of a Hierarchical Model.

In a database, an entity type is the equivalent of a table; each individual record is represented as a row and an attribute as a column. Entity types are related to each other using 1: N mapping, also known as one-to-many relationships.

-------------------------------------*

Relational database matches data by using common characteristics found within the data set. The resulting groups of data are organized and are much easier for people to understand.

For example, a data set containing all the real-estate transactions in a town can be grouped by the year the transaction occurred; or it can be grouped by the sale price of the transaction; or it can be grouped by the buyer's last name; and so on.

Such a grouping uses the relational model (a technical term for this is schema). Hence, such a database is called a "relational database."

The software used to do this grouping is called a relational database management system. The term "relational database" often refers to this type of software.

Relational databases are currently the predominant choice in storing financial records, manufacturing and logistical information, personnel data and much more.

Sunday, August 16, 2009

My_Idea_Is

A. Discuss what you have learned and understood about what Relational Database Management is.

RDBMS, a type of database management system (DBMS) that stores data in the form of related tables. Relational databases are powerful because they require few assumptions about how data is related or how it will be extracted from the database. As a result, the same database can be viewed in many different ways.

B. Define how each of the following fit and function within the framework of Relational DBMS.

Key Fields
A key field is a field or set of fields of a database (typically a relational database) table which together form a unique identifier for a database record (a table entry). The aggregate of these fields is usually referred to simply as "the key".

Database Records
A row—also called a record or tuple—represents a single, implicitly structured data item in a table. In simple terms, a database table can be thought of as consisting of rows and columns or fields. Each row in a table represents a set of related data, and every row in the table has the same structure.
In database records,these are the data that you've just gathered and to be input into a table.

Data Queries
Allow us to view data in different ways. Queries are the primary mechanism for retrieving information from a database and consist of questions presented to the database in a predefined format.

Data Types
A set of value having predefined characteristics, in relational database, this is used to recognize the field if what type of data it is.

Data Forms
Allow us to build a user friendly interface for users.Forms allows you to create and run sets of linked, interactive screens for data entry, retrieval and update. A complete set of screens, known as a form, is defined using commands. There are commands for defining what variables are on each screen, how they are displayed and edited, how the screen is to look, and for linking screens together.

Tables/Database Files
These are the key field, description type etc.

Relationships(Table Linkage)
Is when you make a table then it will link to another table.

Thursday, July 2, 2009

MY_ASSIGNMNT

DATA TYPES
1. What are they?
a. A data type (or datatype) in programming languages is a set of values and the operations on those values.

2. What roles do they play in database?
a. They determine what kind of data will be processed or will become input.

3. 3 Database Management Systems Program
a. Visual FoxPro
Visual FoxPro is a data-centric object-oriented and procedural programming language produced by Microsoft. It is derived from FoxPro (originally known as FoxBASE) which was developed by Fox Software beginning in 1984.
b. SQL
SQL (Structured Query Language) (pronounced /ˌɛskjuːˈɛl/)[1] is a database computer language designed for managing data in relational database management systems (RDBMS). Its scope includes data query and update, schema creation and modification, and data access control. SQL was one of the first languages for Edgar F. Codd's relational model in his influential paper, "A Relational Model of Data for Large Shared Data Banks".[2] and became the most widely used language for relational databases[3][4]
c. FoxPro
FoxPro is a text-based procedurally-oriented programming language and DBMS, originally published by Fox Software and later by Microsoft, for MS-DOS, MS Windows, Apple Macintosh, and UNIX.

Kinds of Data Types and Description
a. integer : In more common parlance, whole number; a number that has no fractional part.
b. floating-point : A number with a decimal point. For example, 3 is an integer, but 3.5 is a floating-point number.
c. character (text ): Readable text

Thursday, June 25, 2009

MVvsdataF

MEMORY VARIABLE

The memory of a computer is organized into a regular pattern of containers for information. These containers for information are called "words". Each one has a numeric address and each one is the same size as each of the others. For most applications, it is inconvenient to refer to portions of memory by their numeric addresses, so programming languages allow us to allocate portions of memory by name. When we store information in the memory of a computer we need to decide on how much we need for various purposes and on how it will be organized. Programming languages provide mechanism for "types" of information in memory. They also provide mechanisms to identify repetitive arrays of items of the same type and to aggregate possibly heterogeneous types under a common name.

Conceptually the memory in a computer is very much like building lots laid out by lot number on a map of the streets of a city before any houses have been built.

It may be convenient to refer to buildings by names that help us remember who lives there. In using computer memory, can be helpful to refer to allocated portions of computer memory by names that help us remember the uses to which we are putting those portions of storage. In computing we call such names "variable" names.

A house has rooms into which we may cram only a certain amount of furniture. The words in a computer have a certain capacity, given in "bits", "nibbles", "characters", "bytes", "octets" "half-words" and other terms intended to convey a sense of smaller or larger portions of a word.

On modern computers, all the units of storage are effectively given in terms of "bits". A bit is a portion of memory which can hold either a "0" or a "1". The other terms mentioned above gone through historic changes and are likely to change again. Different computer manufacturers favored different numbers of bits in, say, a byte.At present, for the most popular desk-top computers and workstations, there is considerable agreement on the following sizes:

DATA FIELD

A data field is a place where you can store data. Commonly used to refer to a column in a database or a field in a data entry form or web form.

The field may contain data to be entered as well as data to be displayed.

A specific area of an electronic record allocated for a particular category of data, usually one data element, such as a name.

A data field is the smallest subdivision of the stored data that can be accessed. A data field can be used to store numerical information such as price, count or a date or time, or even a data and time. A pair of data fields can be used in combination to hold a geo-spatial coordinate. Also, a data field can be used to hold a block of text. A data field takes up permanent storage within the data-store.

The data-store is composed of a number of data records which are, in turn, composed of a number of predefined data fields. Each of these data fields must be defined within the Load Definition File with a unique name.

Data Types
A data field can be any one of four types:

Type	Description
Signed numeric data	Includes numeric values, bit masks, times and dates. May be filtered, sorted, displayed and used within formulas of all types.
String data	Consists of text blocks not exceeding 255 characters. May be sorted and displayed by Flexible Search.
Text data	Consists of text blocks of unlimited size. May only be displayed by Flexible Search, not sorted.
Attribute data	An array of boolean flags that can be turned on or off to set and check specific criteria. An attribute type can hold up to 1024 different attribute flags, each of which can be set to either true or false. However, space should only be set aside for the number of attributes required.

Temporary Fields
In addition, a numeric, string or text data field can be marked as temporary. Temporary fields are used for calculating values or indexing data. A temporary field is like a data field, but it does not take up any permanent storage within the data-store. It is used as a scratch location for holding temprorary data while it is being indexed or used within an expression to generate permanent data for a data field.

Monday, June 22, 2009

_TERMCONTRAST_

DATA vs. INFORMATION

Data is raw material & unorganized facts
that need to be processed
When data are processed, organized, structured or
presented in a given context so as to make them useful,
they are called Information.

Data are plain facts. The word "data" is plural for "datum."
When data are processed, organized, structured or presented in a
given context so as to make them useful, they are called Information.

It is not enough to have data (such as statistics on the economy).
Data themselves are fairly useless. But when these data are interpreted
and processed to determine its true meaning, they becomes useful and
can be called Information. Data is the computer's language.
Information is our translation of this language.

COMPUTER STORAGE vs. DATA STORAGE

Computer storage - an electronic memory device; "a memory and the CPU form
the central part of a computer to which peripherals are attached"
Computer data storage, often called storage or memory, refers to computer
components, devices, and recording media that retain digital data used
for computing for some interval of time. Computer data storage provides
one of the core functions of the modern computer, that of information
retention. It is one of the fundamental components of all modern computers,
and coupled with a central processing unit (CPU, a processor), implements
the basic computer model used since the 1940s.
In contemporary usage, memory usually refers to a form of semiconductor
storage known as random access memory (RAM) and sometimes other forms of
fast but temporary storage. Similarly, storage today more commonly refers
to mass storage - optical discs, forms of magnetic storage like hard disks,
and other types slower than RAM, but of a more permanent nature. Historically,
memory and storage were respectively called primary storage and secondary storage.
The contemporary distinctions are helpful, because they are also fundamental
to the architecture of computers in general. The distinctions also reflect
an important and significant technical difference between memory and mass
storage devices, which has been blurred by the historical usage of the term storage.
Nevertheless, this article uses the traditional nomenclature.

Data Storage
Data storage can refer to:

* Computer data storage; memory, components, devices and media that retain digital computer data used for computing for some interval of time.
* Any data storage device; that records (stores) or retrieves (reads) information (data) from any medium, including the medium itself.

OPERATING SYSTEM vs. COMPUTER SYSTEM

Operating system (commonly abbreviated to either OS or O/S) is an
interface between hardware
and user; it is responsible for the management and coordination
of activities and the sharing of the resources of the computer.
The operating system acts as a host for computing applications that are
run on the machine. As a host, one of the purposes of an operating system
is to handle the details of the operation of the hardware. This relieves
application programs from having to manage these details and makes it
easier to write applications. Almost all computers (including handheld
computers, desktop computers, supercomputers, video game consoles) as
well as some robots, domestic appliances (dishwashers, washing machines),
and portable media players use an operating system of some type. [1] Some of
the oldest models may however use an embedded operating system, that may be
contained on a compact disk or other data storage device.

Computer System:
A complete, working computer. The computer system includes not only
the computer, but also any software and peripheral devices that are
necessary to make the computer function. Every computer system, for example,
requires an operating system.

Wednesday, March 4, 2009

SEARCH

A. DEFINITIONS: -DATA STRUCTURE-

1.Data Structure is the structure that is formed and used by users or programmers to facilitate the processed datas that are needed to be stored. Making or providing structure depends on how or what certain kind of datas it would be stored to properly hold the datas in a memory. Memory is a part when we take data structures. With data structure, it would be convenient to work with datas in memory. Examples of structures are:stacks, linked-list, queues, arrays, graphs, trees, etc. It is also a way of storing data into the computer so that it would be used efficiently with the use of certain algorithms to operate it in the memory space.-personal definition...

2. Data structure in computer science is a way of storing data in a computer so that it can be used efficiently. It is an organization of mathematical and logical concepts of data. Often a carefully chosen data structure will allow the most efficient algorithm to be used. The choice of the data structure often begins from the choice of an abstract data type -http://en.wikipedia.org/wiki/Data_structure

3.An organization of information, usually in memory, for better algorithm efficiency, such as queue, stack, linked list, heap, dictionary, and tree, or conceptual unity, such as the name and address of a person. It may include redundant information, such as length of the list or number of nodes in a subtree.
-http://www.itl.nist.gov/div897/sqg/dads/HTML/datastructur.html

4. In computer science, an organization in software of data that allows more optimal searching, categorizing, or storage of information. Examples: matrix, stack, queue, dequeue, list, vector, scene graph.
-http://en.wiktionary.org/wiki/data_structure

5.A data structure is a specialized format for organizing and storing data. General data structure types include the array, the file, the record, the table, the tree, and so on. Any data structure is designed to organize data to suit a specific purpose so that it can be accessed and worked with in appropriate ways. In computer programming, a data structure may be selected or designed to store data for the purpose of working on it with various algorithms.
-http://searchsqlserver.techtarget.com/sDefinition/0,,sid87_gci804744,00.html

6. A means of storing a collection of data. Computer science is in part the study of methods for effectively using a computer to solve problems, or in other words, determining exactly the problem to be solved. This process entails (1) gaining an understanding of the problem; (2) translating vague descriptions, goals, and contradictory requests, and often unstated desires, into a precisely formulated conceptual solution; and (3) implementing the solution with a computer program. This solution typically consists of two parts: algorithms and data structures.
-http://www.answers.com/topic/data-structure

7. A data structure is a group of data elements grouped together under one name. These data elements, known as members, can have different types and different lengths.
http://www.cplusplus.com/doc/tutorial/structures.html

8. The way data is encoded by a computer. Data structure is not usually the concern of the user. However, in certain cases, e.g. when the user tries to read a text file produced by a different word processor he or she may become aware of the fact that similar screen display does not necessarily mean identical data structure for the computer.
-http://www.uni-duisburg-essen.de/CP/key_terms.htm

9. A way to store and organize data in order to facilitate access and modifications.
- www.azdoqit.com/LS4/Glossary%20of%20EHR%20Terms.doc

10. An organised aggregate of data items. -http://www.cs.ucc.ie/~dgb/courses/swd/glossary.html

------------------------------------------------------------------------------------------------------

B. 3 DATA STRUCTURES UNCOVERED

B.1 VLIST DATA STRUCTURES

-In computer science, the VList is a persistent data structure designed by Phil Bagwell in 2002 that combines the fast indexing of arrays with the easy extension of cons-based (or singly-linked) linked lists.^[1]

OPERATIONS TO ENTER ELEMENTS(DATA)

The primary operations of a VList are:

Locate the kth element (O(1) average, O(log n) worst-case)
Add an element to the front of the VList (O(1) average, with an occasional allocation)
Obtain a new array beginning at the second element of an old array (O(1))
Compute the length of the list (O(log n))

STRUCTURE

The underlying structure of a VList can be seen as a singly-linked list of arrays whose sizes decrease geometrically; in its simplest form, the first contains the first half of the elements in the list, the next the first half of the remainder, and so on. Each of these blocks stores some information such as its size and a pointer to the next.

The average constant-time indexing operation comes directly from this structure; given a random valid index, we simply observe the size of the blocks and follow pointers until we reach the one it should be in.

DATA RETRIEVAL

Any particular reference to a VList is actually a <base, offset> pair indicating the position of its first element in the data structure described above. The base part indicates which of the arrays its first element falls in, while the offset part indicates its index in that array. This makes it easy to "remove" an element from the front of the list; we simply increase the offset, or increase the base and set the offset to zero if the offset goes out of range. If a particular reference is the last to leave a block, the block will be garbage-collected if such facilities are available, or otherwise must be freed explicitly.

Because the lists are constructed incrementally, the first array in the array list may not contain twice as many values as the next one, although the rest do; this does not significantly impact indexing performance. We nevertheless allocate this much space for the first array, so that if we add more elements to the front of the list in the future we can simply add them to this list and update the size. If the array fills up, we create a new array, twice as large again as this one, and link it to the old first array.

B.2 HASH TABLE

-In computer science, a hash table, or a hash map, is a data structure that associates keys with values.

THE STRUCTURE

The primary operation that hash functions support efficiently is a lookup: given a key (e.g., a person's name), find the corresponding value (e.g., that person's telephone number). It works by transforming the key using a hash function into a hash, a number that is used as an index in an array to locate the desired location ("bucket") where the values should be.

Hash tables support the efficient lookup, insertion and deletion of elements in constant time on average (O(1)) that does not vary with the number of elements stored in the table; although may vary somewhat depending on how full the table is.

BASIC OPERATION

A hash table works by transforming the key using a hash function into a hash, a number that is used as an index in an array to locate the desired location ("bucket") where the values should be. The number is normally converted into the index by taking a modulo operation, or sometimes bit masking is used where the array size is a power of two. The optimal hash function for any given use of a hash table can vary widely, however, depending on the nature of the key.

It is also possible to create a hash table statically where, for example, there is a fairly limited fixed set of input values - such as the values representable by a single byte (or possibly two bytes ) from which an index can be constructed directly (see section below on creating hash tables). The hash table can also be used simultaneously for tests of validity on the values that are disallowed.

LOAD FACTOR: ACCOMODATE

An important property of a hash table is how "full" the table is, that is the ratio between the number of entries n and the size s of the hash table (i.e. the size of the array it uses to store values). The quotient n/s is therefore called the load factor of the hash table.

Most hash table implementations only perform well if the load factor is kept in acertain range.

RETRIEVAL

Hash tables store data in pseudo-random locations, so accessing the data in a sorted manner is a very time consuming operation. Other data structures such as self-balancing binary search trees generally operate more slowly (since their lookup time is O(log n)) and are rather more complex to implement than hash tables but maintain a sorted data structure at all times.

B.3 HEAP DATA STRUCTURE

THE STRUCTURE
-In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if B is a child node of A, then key(A) ≥ key(B). This implies that an element with the greatest key is always in the root node, and so such a heap is sometimes called a max heap. (Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min heap.) This is why heaps are used to implement priority queues. The efficiency of heap operations is crucial in several graph algorithms.

OPERATION TO GET CONTENTS ON THE NODES:

The operations commonly performed with a heap are

delete-max or delete-min: removing the root node of a max- or min-heap, respectively
increase-key or decrease-key: updating a key within a max- or min-heap, respectively
insert: adding a new key to the heap
merge: joining two heaps to form a valid new heap containing all the elements of both.

Comparison of theoretic bounds for variants

The following complexities^[1] are worst-case for binary and binomial heaps and amortized complexity for Fibonacci heap. O(f) gives asymptotic upper bound and Θ(f) is asymptotically tight bound (see Big O notation). Function names assume a min-heap.

Operation	Binary	Binomial	Fibonacci
createHeap	Θ(1)	Θ(1)	Θ(1)
findMin	Θ(1)	Θ(lg n) or Θ(1)	Θ(1)
deleteMin	Θ(lg n)	Θlg n)	O(lg n)
insert	Θ(lg n)	O(lg n)	Θ(1)
decreaseKey	Θ(lg n)	Θ(lg n)	Θ(1)
merge	Θ(n)	O(lg n)	Θ(1)

For pairing heaps the insert and merge operations are conjectured^{[citation needed]} to be O(1) amortized complexity but this has not yet been proven. decreaseKey is not O(1) amortized complexity [1] [2]

HEAP RETRIEVAL

Heaps are a favorite data structures for many applications.

Heapsort: One of the best sorting methods being in-place and with no quadratic worst-case scenarios.
Selection algorithms: Finding the min, max or both of them, median or even any k-th element in sublinear time^{[citation needed]} can be done dynamically with heaps.
Graph algorithms: By using heaps as internal traversal data structures, run time will be reduced by an order of polynomial. Examples of such problems are Prim's minimal spanning tree algorithm and Dijkstra's shortest path problem.

Interestingly, full and almost full binary heaps may be represented using an array alone. The first (or last) element will contain the root. The next two elements of the array contain its children. The next four contain the four children of the two child nodes, etc. Thus the children of the node at position n would be at positions 2n and 2n+1 in a one-based array, or 2n+1 and 2n+2 in a zero-based array. Balancing a heap is done by swapping elements which are out of order. As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.

One more advantage of heaps over trees in some applications is that construction of heaps can be done in linear time using Tarjan's algorithm.

HEAP IMPLEMENTATION

The C++ Standard Template Library provides the make_heap, push_heap and pop_heap algorithms for binary heaps, which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion detailed above.