Performance while fetching data from large record set
Hi All,
Well I wasn't able to decide where to post this, since this is a table design issue, but since I would be using JDBC for this hence decided to take your opinions here.
I need a suggestion from you all. I have a table designed as Class, Student, Day, Attendance. there could be 50 maximum class and each class could have a maximum of 50 students. I have to mark the Attendance field as Y or N based on the Day(Mon-Fri).
Presently My table data is something like this:-
Class|Student|Day|Attendance
11MY
11TuY
......
21MN
21TuY
.....
and so on for each Class and Student.
What can be seen is that this would result in a huge data. I am planning to implement the table by increasing the column as introducing Mon - Fri and removing the existing Day and Attendance columns, and then for each Class, Student combination, I would mark Y or N in the Mon - Fri columns. Hence my table data becomes like this:-
Class|Student|Mon|Tue|.....
11YY
12NY
........
21YN
.....
This design leads to adding more columns as compared to the existing one, but leads to significant reduction in data.
It is obvious that while fetching data from the existing table, it would be a performance issue, if either I fetch the last record from the table or I fetch all the records, whereas in the new design, it reduces a lot.
Pleas share your thoughts on this and help me out with some suggestions.
Thanks,
Jayant
# 1
My ideas:
1.
50*50 (max) is not big.
If you have 2500 *7 rows with 4 columns, or 2500 rows with 10 columns, does not make a real difference.
You might not notice a real difference with reasonable queries.
I undserstand your use case is
SELECT * from MyTable WHERE class=? AND student=?
This would return 7 (or 5 ?) rows with 4 columns in the first case vs. 1 row with 9 (7?) columns in the second case.
I bet you won't notice any difference in performance. The question is rather:
which result is more convenient for you.
If there's no other reason, just leave it as is.
2.
If the table grows dynamically for a year or so (to archive the students' actual attendance?), you'd have to add the date, eventually instead of the weekday. Then you rather keep the first design, and you could eventually skip the Attendance=N records.
# 2
Hard to say why you think it would be big.
Note as well that no normal school is going to have classes mon-fri all of the time. At with some schools classes can occur on other days of the week, such as during finals.
Nor would you put students and classes in the same table.
Nor do you don't really need to mark both present and absent, you can get by with just one.
So probably a minimal (which ignores some realities like historical data)
Tables:
class - defines class details like name, has primary key class_id
student - defines student like name, has primary key student_id
schedule- foreign key to class (class_id) and each day class will be held, has primary key schedule_id
attendance - foreign key to student_id, schedule_id and field indicating present or not.