2017年7月30日星期日

Data Structure and Algorithms Show (6): Quick Sort

Quick Sort is my favorite sort algorithm. The reason is simple: Quick Sort is efficient enough and easy to understand and implement, in recursive way or not.

Like Merge Sort, we applies sub-grouping concept here as well. The difference is that we select a pivot randomly in the sorting array first, then put all the elements smaller or equal to a left sub group and put rest of elements to a right group. The pivot will be in the middle right between two sub groups. Very easy, right? Finding a element as a pivot, smaller ones go left and bigger ones go right. Now, we have two small groups, each group repeats the process again, until only zero or one element in a sub group. If you are using recursive method, this is the condition for your method to exit the loop and return.

During the animation show for Quick Sort below, I select first element of a group as pivot and mark pivots as black rectangle after I sub-group rest elements. Different sub-group will be filled with different colour in their rectangle. The animation will make sense to you once you understand Quick Sort algorithm I mentioned above.

$scope.quickSort = function(list){
    if(list.length <= 1){
        return list;
    }
    var left = [];
    var right = [];
    var pivot = list[0];
    for(var i = 1; i < list.length; i++){
        if(list[i] <= pivot){
            left.push(list[i]);
        }else{
            right.push(list[i]);
        }
    }

    return $scope.quickSort(left).concat(pivot).concat($scope.quickSort(right));
}

Of course, you still can implement Quick Sort in a non-recursive way. How, I hope you can figure it out by yourself. Hint: instead of depending on JavaScript engine put your env into a stack, you can have your own stack to push and pop sub group information for future usage. Anyway, if you want to see the answer directly, press F12, all the sort animation source codes are available in current web page. Yes, I have to show the Quick Sort animation in a non-recursive way. Don't be scared by AngularJS and D3 related codes, just focus on quickSortShow() function.

2017年7月28日星期五

Data Structures and Algorithms Show (5): Merge Sort

$scope.mergeSortTopDown = function(list) {
    // return the one element array as a recursive exising condition.
    if(list.length < 2){
        return list;
    }
    var middle = Math.floor(list.length / 2);
    var left = list.slice(0, middle);
    var right = list.slice(middle, list.length);
    // continue to split into two sub arrays in a recursive way.
    left = $scope.mergeSortTopDown(left);
    right = $scope.mergeSortTopDown(right);
    var retList = [];
    var i = 0;
    var j = 0;
    // concat two sub arrays together in order.
    while(i < left.length && j < right.length){
        if(left[i] <= right[j]){
            retList.push(left[i]);
            i++;
        }else{
            retList.push(right[j]);
            j++;
        }
    }
    // if there is more elements left in left sub array.
    while(i < left.length){
        retList.push(left[i++]);
    }
    // if there is more elements left in right sub array.
    while(j < right.length){
        retList.push(right[j++]);
    }
    console.log(retList);
    return retList;
}
$scope.mergeSortBottomUp = function(list){
    var step = 1;
    var pos = 0;
    while(step < list.length){
        while(pos < list.length){
            $scope.mergeCompare(list, pos, step);
            pos += step * 2;
            console.log(list);
        }
        pos = 0;
        step *= 2;
    }
}

$scope.mergeCompare = function(list, pos, step){
    var left = list.slice(pos, pos + step);
    var right = list.slice(pos + step, pos + step * 2);
    var i = 0;
    var j = 0;
    var k = pos;
    while(i < left.length && j < right.length){
        if(left[i] <= right[j]){
            list[k++] = left[i++];
        }else{
            list[k++] = right[j++];
        }
    }
    // if there is more elements left in left sub array.
    while(i < left.length){
        list[k++] = left[i++];
    }
    // if there is more elements left in right sub array.
    while(j < right.length){
        list[k++] = right[j++];
    }
}

Last time we already knew the secret of one of the advanced sort algorithms, Shell Sort. Basing on Insertion Sort make Shell Sort a little bit more complicated than those simple sort algorithms but more efficient. Nevertheless, finding a best gaps array is not a easy task, which cause a rare application of Shell Sort. So, let us move onto another advanced sort method, Merge Sort.

In fact, I want to tell you that we have begun to cross a line of thinking and understanding sort algorithms. Because the advanced sort algorithms we will see, like Merge Sort and Quick Sort, all are available to involve with a new concept called Recursive Process; And sub-groups a big array to smaller groups to simpler the sort process step by step. Yes, Sub-Grouping is the key to understand these sort algorithms. And This theory makes sense in real life too. If you got a big problem, knock it down to several smaller problems first, then each small problem can be handled easier and you conquer the big problem when you finished every small ones.

Focusing on Merge Sort. The sort algorithm is called this name for a obvious reason, the data list waiting for sorting will be divided into sub-lists, and the sub list will be divided into sub-lists as well. The division process will continue until the sub list only has one element inside. Now, the merging part begin. every two sub lists will be merge together in order, which means the merged list will have smaller elements on the left and larger elements on the right. The merging process will repeat until we get a data list includes all the elements again.

As we mentioned before, we can use recursive process to control the sub-grouping. Recursive Process means a function invokes itself repeatedly. Of course, there must be a condition inside the function to tell the function to stop invoking to avoid dead loop. We can see that Merge Sort is a good scenario to apply Recursive Process. Sort function will keep dividing the list until the sub list only has one element inside. We call this kind of Merge Sort is in a Top-Down way.

Unfortunately, if the list is huge enough, Top-Down way is not a very good choice, because Recursive Process need to consume a lot of stack space. Especially for JavaScript, your web browser will not be happy if you have a big array need to sort in Top-Down way. The good news is, most of the Recursive Process can be achieved in another way, Non-Recursive Process. Glad to tell you that Merge Sort has a Bottom-Up way to implement. In this way, we handle every single element in the array first, and begin to merge every two of them, them four of them, until the whole list being remerged.

In our animation show, we are using Bottom-Up Merge Sort. The reason is simple, I cannot stop the Recursive Process in a cheap way and it is controlled by JavaScript engine of web browser. But don't worry, I will give you both the sample scripts for Bottom-Up and Top-Down.

Because it is Bottom-Up Merge Sort, I will take every element in the list as a sub list at the beginning and every sub list has its own background colour. When I merge two sub lists together, a new sub list will be generated and a new colour will fill the background of elements to present a new merged list. Enjoy!

2017年7月25日星期二

Data Structures and Algorithms Show(4): Shell Sort

Shell Sort has no relationship with shells in the ocean at all. This name comes from Donald Shell who created this algorithm several decades ago. If you read the sample code, it is easy to find out that this sort algorithm is based on Insertion Sort. The big difference inside Shell Sort is that gaps concept injection.

We have already known that comparison always happens between two adjacent elements and exchange them if necessary. What is going to happen if we compare two elements far away to each other? I know you do not feel much difference with this idea, but the gaps between the elements comparisons make the sort more efficient. This kind of comparison and long distance movement can reduce a huge amount of disorder very fast.

For example, we can define a gaps array which include three gap values: [3, 2, 1]. Then, we go through Insertion Sort three times:

  1. Compare the elements with gap size 3: element 4 compare to element 1; element 5 compare to element 2; element 6 compare to element 3; element 7 compare to element 4; element 7 compare to element 1 (7-3-3=1); element 8 compare to element 5; element 8 compare to element 2; to the end of the array.
  2. Compare the elements with gap size 2: element 3 compare to element 1; element 4 compare to element 2; element 5 compare to element 3; element 5 compare to element 1(5-2-2=1); element 6 compare to element 4; element 6 compare to element 2; to the end of the array.
  3. Compare the elements with gap size 1: This is the original Insertion Sort.
I believe you just began to realize that there is a problem of Shell Sort. What are the best gaps to define? [3, 2, 1] or [256, 100, 40, 15, 6, 2, 1]? Unfortunately, the answer is still in the air. Some people do find some rules to get a reasonable gaps array, but with different data, we have no unified answer for it.

I think that is a reason why Shell Sort has no a big role in real application world. But spending some time on it is worthy. Now we understand that making a difference to those existed algorithm can generate a better (or worse) one. Anything is possible.
var gaps = [3,2,1];
for(var g = 0; g < gaps.length; g++){
    for(var i = gaps[g]; i < list.length; i++){
        var tmp = list[i];
        var insPos = i;
        for(var j = i; j - gaps[g] >= 0; j -= gaps[g]){
            if(tmp < list[j-gaps[g]]){
                list[j] = list[j - gaps[g]];
                insPos = j - gaps[g];
            }else{
                break;
            }
        }
        list[insPos] = tmp;
    }
}

2017年7月20日星期四

How to Make Two or More Persistence Units in one persistence.xml Work in Your JPA Related Project

JPA is not a new concept for the people who works with Java and J2EE related technologies. In fact, as a specification, JPA has a lot of different implements. For example, Wildfly from JBoss, Spring and Hibernate. To use JPA in a project, we need a file called persistence.xml which defines data sources, JDO classes, data operation details and some special properties. You can find tons of simple samples of persistence.xml on Internet, but most of them only include one data source inside. So, we are going to talk about something different, a little bit more complicated scenario here: Define two data sources inside one persistence.xml. And we will mention something you may need to know to handle this kind of persistence.xml.

Let us start with a sample first.

<?xml version="1.0" encoding="UTF-8"?>
<persistence version="2.1" xmlns="http://xmlns.jcp.org/xml/ns/persistence" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/persistence http://xmlns.jcp.org/xml/ns/persistence/persistence_2_1.xsd">
 <persistence-unit name="user-module" transaction-type="JTA">
         <description>RFID Module Persistence Unit</description>
  <jta-data-source>java:jboss/datasources/DS1</jta-data-source>
  <class>org.jsgarage.example.User</class>
  <class>org.jsgarage.example.Role</class>
  <class>org.jsgarage.example.Permission</class>
  <exclude-unlisted-classes/>
  <properties>
   <property name="hibernate.hbm2ddl.auto" value="update" />
   <property name="hibernate.show_sql" value="false" />
   <property name="hibernate.format_sql" value="false" />
   <property name="hibernate.use_sql_comments" value="false" />
   <property name="hibernate.jdbc.fetch_size" value="1000" />
   <property name="hibernate.jdbc.batch_size" value="50" />
  </properties>
 </persistence-unit>
 <persistence-unit name="delivery-module" transaction-type="JTA">
  <jta-data-source>java:jboss/datasources/DS2</jta-data-source>
         <class>org.jsgarage.example.Article</class>
         <class>org.jsgarage.example.Order</class>
         <class>org.jsgarage.example.Product</class>
         <exclude-unlisted-classes/>
         <properties>
             <property name="hibernate.hbm2ddl.auto" value="validate" />
   <property name="hibernate.dialect" value="org.hibernate.dialect.SQLServer2008Dialect"/>
  </properties>
 </persistence-unit> 
</persistence>

Let us take a look what we have in the persistence.xml.

Element
Attribute
Properties
Description
persistence-unit Includes data source, JDO classes and special properties. Defines transaction management type.
transaction-type JTA means you are using Container Transaction Management. Such as inside a J2EE container. RESOURCE_LOCAL means you manage transactions by yourself.
jta-data-source Data source defined inside an application server can be referred with a JNDI naming.
class class listed here should be the one with @Entity annotation.
exclude-unlisted-classes Without this element showing up, your JPA implement may not be able to finish its job because it try to map all the classes can be found in the persistence.xml, not matter those classes belong to which unit. Ridiculous, right?
hibernate.hbm2ddl.auto update means your JPA implement will try to update database table schemas with your JDO class entity definition. validate means the schemas will be checked to make sure consistent.
hibernate.dialect Let your JPA implement know what kind of Database is behind the data source.
Note: some simple and easy to understand properties have been ignored. You alwarys can find some official explaination with Google.

It looks like simple enough, right? But be ready, there are some rules you need to obey if you want to have more than one data source included in your persistence.xml:

  • Make sure you have <exclude-unlisted-classes> element within your <persistence-unit>. Otherwise, the unit will try to grab all the classes inside this xml file and fail to map those tables which are not existed in this unit related data source. Of course, if your design is to have exact the same schemas in all the data sources defined in this xml file, it would be fine. But I would be very curious about the purpose.
  • Set property hibernate.hbm2ddl.auto value carefully. "update" will generate or alter the table schemas in your database automatically during the initial mapping process. It is a risk to apply in a production environment. Sure enough, if you have no <exclude-unlisted-classes> defined here, you may be glad to see all the data sources will have same tables inside. "validate" will not cause this mess, but any difference between your JDO class entity and table schemas in DB will fail the initial mapping process.
  • Property hibernate.dialect can help your JPA implement to know what kind of DB behind the data source. Although the definition may include the database type with it, but one brand may have a huge range of different versions. If you JPA implement need it, let it know.
After you defined persistence units in your persistence.xml, don't forget to use annotation to make them work in your java project. And you can define your own annotations to distingguish the different EntityManager instances during the @Inject process. Example here:

 @Produces
 @UserDBQualifier
 @PersistenceContext(unitName = "user-module")
 private EntityManager emUser;
 
 @Produces
 @DeliveryDBQualifier
 @PersistenceContext(unitName = "delivery-module")
 private EntityManager emDelivery;

At last, if you are using Wildfly and Switchyard, congratulations, Switchyard module may not be able to let you use Container Management Transaction if your codes are not part of the Switchyard Service or not a interface defined on the module. But do not worry, UserTransaction can serve you well, like this:


 @Resource(lookup = "java:jboss/UserTransaction")
 private UserTransaction userTransaction;

2017年7月18日星期二

Data Structures and Algorithms Show(3): Insertion Sort

Insertion Sort gets its name because every element in the array try to find a position closer to the start position of the array. The movement is just like a insertion action. The sort process start from the second element in the array. if the element is smaller than those elements before it, then those elements will be shifted toward its position. As a result, this element can be inserted to the place which is empty after the shift. The sort process completes when the last element in the array go through the steps.

With the first glance, it may look like this sort algorithm that is better than Bubble Sort but worse than Selection Sort. Surprise enough, Insertion Sort is even a little bit better than Selection Sort.

Any way, we need to know that all these three sort algorithms are not using by most of the programs in our computer because they are not efficient enough. They all need two layers loops to compare and move elements. In the worse scenario, an array with n elements may need n*n times steps to achieve the result. It could be 1 million if n is 1000.
for (var i = 1; i < list.length; i++) {
var select = list[i];
var insPos = i;
for (var j = i - 1; j >= 0; j--) {
if (select < list[j]) {
list[j + 1] = list[j];
insPos = j;
}else{
break;
}
}
list[insPos] = select;
}

2017年7月17日星期一

Data Structures and Algorithms Show (2): Selection Sort

In my last blog (http://notaiyet.blogspot.com/2017/07/a-fantastic-way-to-show-and-teach-data.html), we have a Bubble Sort Show to help us to understand what is bubble sort and the basic theory of sort action: comparison. Although bubble sort is a good example to explain the comparison during the sort process, it is not a good sort algorithm. You may have known the reason that there are too many times exchange operations happen from the beginning to the end.

If you have n elements need to be sorted, in the worst case, bubble sort might need n*n times' comparisons and n*n times' exchange operations.

Fortunately, Selection Sort can help a little bit with less exchange actions despite n*n times' comparisons is still a have to. 

In selection sort, there are n rounds of comparisons if n elements exist in the array. the first round comparison starts with the first element in the array. The position of smaller element during the comparison will be recorded. The position of the smallest element can be located after all the elements have passed through the comparison. Then the smallest element will be placed into first position of the array. The second round will start with the second element in the array and go through the same process. The sort is done when the last second element finish the comparison and the exchange. In worst scenario, selection sort only need n times' exchange, not n*n times. No doubt, This is a better sort algorithm then bubble sort. Although it is still not good enough.

So, finding out the smallest(or the biggest) element in the array is a key conception of Selection Sort. Now, we are ready for the Selection Sort Show. You will see how the smallest element can be located during the every round of comparison and be put to the right place.


for (var i = 0; i < list.length; i++) {
var min = i;
for (var j = i; j < list.length - 1; j++) {
if (list[min] > list[j + 1]) {
min = j + 1;
}
}
// need to put the smallest element to the front if the front one is not the smallest one.
if (min != i) {
var temp = list[min];
list[min] = list[i];
list[i] = temp;
}
}



It is not difficult to realize that Selection Sort is a little bit faster than Bubble Sort, right?

2017年7月12日星期三

Data Structures and Algorithms Show (1): Bubble Sort

AI has been getting closer and closer to human life. I believe it is a time for everybody to know more about the computer related technology. Just like we need to know how to do shopping or even works online which are not necessary skills half century ago. People need to be able to understand how the things going on inside all kinds of hardware like mobile phone, PC, vehicle and so on.

Of course, currently, Artificial Intelligence is not smart as human brain yet. In fact, the way that AI thinks is quite different with human brain. Although I believe Google and other tech giants are trying to make AI more like human brain, AI still deeply depends on data structures and algorithms which were designed by human. So, let us say, learn and understand data structures and algorithms will give you a opportunity to know how incredible results can be achieved by computers and software inside.

Unfortunately, the only thing can be understood by machines are 0 and 1. Human cannot think like a machine and vice versa. So some great engineers invented middle languages to fill the gap between human and machine. These languages abstract the commands can be performed by machines and let normal people have easier way to work with machines.

As a blogger who uses computer everyday, dig a little bit deeper than a normal user into machines' world would be helpful to the blog works. Personally, I reckon JavaScript is the best choice to start as everyone has a web browser to surf today. And I want to show and teach data structures and algorithms in a interactive way, to see if we can make the process more understandable. After all, those languages and conceptions are not totally same with stuffs in human world.

Enough blabs. Let us start with sort. It looks like that Sort is not a natural thing exists in the universe. Human sort things for many different purposes, such as management and science research. What is the core part of the sort? I bet you know it. Yes, it is comparison. The simplest example of sort is numbers, we have the ability to sort numbers in ascent or descent way. Computer do the sort things with the same method. Want a more complex example? How about a group of people? Of course, people are not number, how to work it out? Simple, Instead of sorting people, we sort people's attributes. Sorting by age, weight, or even name. It all depends on the requirement.

In machines, a group of numbers or objects(represented with a lot of 0 and 1) waiting for sorting are normally stored in a list. We call this list Array if all the members in the list all sit together like a line. Have to say, Array is the most fundamental data structure in this domain. We will start with the simplest(and the slowest maybe) sort algorithm which called Bubble Sort.

Just like the name, Bubble Sort compare two adjacent elements from the beginning of the Array, exchange two elements if the second one is smaller than the first one (or vice versa. depends on ascending or descending order we want). Then compare the second and third elements until the last two elements compared and exchanged if necessary. If you have (n) elements in the Array, repeat the process (n-1) times. And every time the biggest or smallest number will float to the end of the Array, which means the number does not need to be compared next time.

Now, the key part is coming, let us use D3 and AngularJS to show the whole process. I hope it is easy to explain the algorithm and can learn Bubble Sort, JavaScript, D3 and AngularJS from it.



for (var i = 0; i < list.length - 1; i++) {
for (var j = 0; j < list.length - 1 - i; j++) {
if (list[j] > list[j + 1]) {
var temp = list[j];
list[j] = list[j + 1];
list[j + 1] = temp;
}
}
}

Bubble sort is very simple sort algorithm and should be easy to understand. I have show the whole process of the sort and the example codes. I hope you can have an idea how computer do the magic for us.

If you are interested, you can press F12(on PC) to see the JavaScript source code in this blog page. In fact, to achieve the animation, D3 and AngularJS help a lot.

To let blogger support AngularJS, you need to modify the blog theme to put AngularJS libraries into the template. You can check my another blog to learn how to do it (http://notaiyet.blogspot.com/2017/06/make-your-own-javascript-codes-work-on.html). The following content should be included in the template:
<link rel="stylesheet" href="https://ajax.googleapis.com/ajax/libs/angular_material/1.1.0/angular-material.min.css" />
<!-- Angular Material requires Angular.js Libraries -->
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.5/angular.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.5/angular-animate.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.5/angular-aria.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/angularjs/1.5.5/angular-messages.min.js"></script>

<!-- Angular Material Library -->
<script src="https://ajax.googleapis.com/ajax/libs/angular_material/1.1.0/angular-material.min.js"></script>
<script src="https://d3js.org/d3.v4.min.js"></script>

There is one more thing I want to mention: keep your blog page a backup somewhere. For some reason, blogger will change your HTML a little bit. In that case, some AngularJS tags may be moved to some unwilling places and cause some problems. But you always can go to the HTML editor page in blogger and paste correct HTML code from your backup and save it directly. Remember to check the preview page to confirm that "Compose" editor do not cause any damage to you.

Good luck to your digital diving journey!