Most of the time we just verify the data type based on mapping document, but we need to think why they opted this data type and does it make sense or not in terms of business and technical perspective?
Normally the table structure will be defined by data model during the process of modeling. Sometimes they might map incorrect/inappropriate data type for a column, for an example , mapping varchar to a date column. It won’t affect the business standard but when you think of database level definitely it will have an impact.
How wrong data type selection impact?
It directly impacts in below ways,
- Data Integrity
The table data types as,
Column – Data type
DOB – Date
Updated_at – VARCHAR (20)
Now the requirement is, select the next date of DOB and Updated_at. The select query for DOB would be,
Select ID, Name, DOB+1 from table
The select query for Updated_at would be,
Select ID, Name, to_date (Updated_at, ‘DD-MON-YY’)+1 from table
From above query, the date addition for DOB happens directly but for Updated_at column first we have to convert into date then we need to add one day.
When you run both queries, we can see the performance difference, the query for Updated_at takes longer than DOB since it involves type casting operation.
Finally, what data type we are selecting also matter in terms of performance during data modeling.
The bytes of DATE type are 7, but the Updated_at column uses 10 bytes. It occupies extra space for each row, at the end, the block size will be high.
The database won’t allow any values other than Date for DATE type columns, but it allows any characters, number, and special characters for VARCHAR column. It ends with the loss of data integrity.
There are chances that the development handled this restriction in ETL jobs, but still database level it allows.
If the table is used in multiple jobs or SQL files, everywhere the cast need to be done. It would increase the script complexity.
Always it would be better to have a test case to ensure that the data type mapping for all columns.