Skip to content

Commit 82873d3

Browse files
committed
Merge pull request braydie#7 from gregestren/patch-5
Grammar fix
2 parents 89abeb3 + 72a46ac commit 82873d3

File tree

1 file changed

+2
-2
lines changed

1 file changed

+2
-2
lines changed

en/2-Intermediate/Personal-Skills/11-How to analyze data.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,6 @@ Not so.
66

77
No matter at which stage you start looking at it, data is the main concern of a well designed application. If you look closely at how a business analyst gets the requirements out of the customer's requests, you'll realize that data plays a fundamental role. The analyst creates so called Data Flow Diagrams, where all data sources are identified and the flow of information is shaped. Having clearly defined which data should be part of the system, the designer will shape up the data sources, in terms of database relations, data exchange protocols, and file formats, so that the task is ready to be passed down to the programmer. However, the process is not over yet, because you (the programmer) even after this thorough process of data refinement, are required to analyze data to perform the task in the best possible way. The bottom line of your task is the core message of Niklaus Wirth, the father of several languages. "Algorithms + Data Structures = Programs." There is never an algorithm standing alone, doing something to itself. Every algorithm is supposed to do something to at least one piece of data.
88

9-
Therefore, since algorithms don't spin their wheels in a vacuum, you need to analyze both the data that somebody else has identified for you and the data that is necessary to write down your code. A trivial example will make the matter clearer. You are implementing a search routine for a library. According to your specifications, the user can select books by a combination of genre, author, title, publisher, printing year, and number of pages. The ultimate goal of your routine is to produce a legal SQL statement to search the back-end database. Based on these requirements, you have several choices: check each control in turn, using a "switch" statement, or several "if" ones; make an array of data controls, checking each element to see if it is set; create (or use) an abstract control object from which inherit all your specific controls, and connect them to an event-driven engine. If your requirements include also tuning up the query performance, by making sure that the items are checked in a specific order, you may consider using a tree of components to build your SQL statement. As you can see, the choice of the algorithm depends on the data you decide to use, or to create. Such decisions can make all the difference between an efficient algorithm and a disastrous one. However, efficiency is not the only concern. You may use a dozen named variables in your code and make it as efficient as it can ever be. But such a piece of code might not be easily maintainable. Perhaps choosing an appropriate container for your variables could keep the same speed and in addition allow your colleagues to understand the code better when they look at it next year. Furthermore, choosing a well defined data structure may allow them to extend the functionality of your code without rewriting it. In the long run, your choices of data determines how long your code will survive after you are finished with it. Let me give you another example, just some more food for thought. Let's suppose that your task is to find all the words in a dictionary with more than three anagrams, where an anagram must be another word in the same dictionary. If you think of it as a computational task, you will end up with an endless effort, trying to work out all the combinations of each word and then comparing it to the other words in the list. However, if you analyze the data at hand, you'll realize that each word may be represented by a record containing the word itself and a sorted array of its letters as ID. Armed with such knowledge, finding anagrams means just sorting the list on the additional field and picking up the ones that share the same ID. The brute force algorithm may take several days to run, while the smart one is just a matter of a few seconds. Remember this example the next time you are facing an intractable problem.
9+
Therefore, since algorithms don't spin their wheels in a vacuum, you need to analyze both the data that somebody else has identified for you and the data that is necessary to write down your code. A trivial example will make the matter clearer. You are implementing a search routine for a library. According to your specifications, the user can select books by a combination of genre, author, title, publisher, printing year, and number of pages. The ultimate goal of your routine is to produce a legal SQL statement to search the back-end database. Based on these requirements, you have several choices: check each control in turn, using a "switch" statement, or several "if" ones; make an array of data controls, checking each element to see if it is set; create (or use) an abstract control object from which to inherit all your specific controls, and connect them to an event-driven engine. If your requirements include also tuning up the query performance, by making sure that the items are checked in a specific order, you may consider using a tree of components to build your SQL statement. As you can see, the choice of the algorithm depends on the data you decide to use, or to create. Such decisions can make all the difference between an efficient algorithm and a disastrous one. However, efficiency is not the only concern. You may use a dozen named variables in your code and make it as efficient as it can ever be. But such a piece of code might not be easily maintainable. Perhaps choosing an appropriate container for your variables could keep the same speed and in addition allow your colleagues to understand the code better when they look at it next year. Furthermore, choosing a well defined data structure may allow them to extend the functionality of your code without rewriting it. In the long run, your choices of data determines how long your code will survive after you are finished with it. Let me give you another example, just some more food for thought. Let's suppose that your task is to find all the words in a dictionary with more than three anagrams, where an anagram must be another word in the same dictionary. If you think of it as a computational task, you will end up with an endless effort, trying to work out all the combinations of each word and then comparing it to the other words in the list. However, if you analyze the data at hand, you'll realize that each word may be represented by a record containing the word itself and a sorted array of its letters as ID. Armed with such knowledge, finding anagrams means just sorting the list on the additional field and picking up the ones that share the same ID. The brute force algorithm may take several days to run, while the smart one is just a matter of a few seconds. Remember this example the next time you are facing an intractable problem.
1010

11-
Next [Team Skills - How to Manage Development Time](../Team-Skills/01-How to Manage Development Time.md)
11+
Next [Team Skills - How to Manage Development Time](../Team-Skills/01-How to Manage Development Time.md)

0 commit comments

Comments
 (0)