- This thesis was submitted to the Graduate School of Science and Technology, Computer Science and Electrical Engineering, Kumamoto University, Japan, in September 2017 in partial fulfillment of the requirements for the degree of Master of Engineering. It was not published, so the copyright remains with me, "Fajar Purnama", the main author. I have the authority to repost it anywhere, and I claim full responsibility, detached from Kumamoto University. I hereby license it under a customized CC-BY-SA: you are also allowed to sell my contents, on the condition that you mention that the free and open version is available here. In summary, the mention must contain the keywords "free" and "open" and the location, such as the link to this content.
- The presentation is available at Slide Share.
- The source code is available at Github.
Below are the publications reused in this thesis that do not require copyright clearance:
- Hand Carry Data Collecting Through Questionnaire and Quiz Alike Using Mini-computer Raspberry Pi 
Below are the publications reused in this thesis that require copyright clearance:
The continuous advance of electronics and information communication technologies (ICT) has greatly influenced every aspect of life; this thesis discusses the education aspect. Electronics and ICT have been incorporated into the learning and teaching process, giving birth to electronic learning (e-learning). Within e-learning there is a well-known term, online course, whose essence is being able to deliver courses at a distance with flexibility in place and time. However, one condition must be met in order to implement online courses: sufficient ICT infrastructure. Unfortunately, not all regions meet this condition, which limits the accessibility of online courses. Other than improving the ICT infrastructure, a distributed learning management system (LMS) was proposed as an alternative, but the next issue was the maintenance or synchronization, which in this case means keeping the learning contents up to date. Two problems are highlighted in this thesis: the inability to perform synchronization in regions with severely limited network connectivity, and duplicate data transfer during synchronization.
To overcome the synchronization problem in regions with severely limited network connectivity, the solution is to utilize hand carry servers. Implementing hand carry servers in a distributed LMS grants mobility to its servers. The proposed concept is to have the hand carry server physically seek network connectivity to perform online synchronization, and afterwards return to its original location. The hand carry server was shown to be portable owing to its small size, light weight, and low power consumption, where a power bank is enough to supply it for a whole day. However, it has resource limitations in terms of central processing unit (CPU) and random access memory (RAM), which limit its performance.
To overcome duplicate data transfer during synchronization, incremental synchronization was utilized instead of full synchronization. This thesis also introduces a new approach called dump and upload based synchronization, which overcomes the obstacles of different LMSs and LMS versions faced by dynamic content synchronization.
Electronics and information communication technology (ICT) have made many tasks more convenient, including delivering education. Many have incorporated electronics into their learning and teaching process. A few examples are teachers using laptops and projectors to present their materials, students browsing the Internet to search for information, and both of them using emails, chats, or social networking services to communicate. These kinds of activities are generally called electronic learning (e-learning), as illustrated in Figure 1.1.
Figure 1.1 Illustration of e-learning showing the many electronic devices that can be used (images from openclipart).
This thesis, however, does not discuss e-learning broadly, but a category within e-learning called online course. It uses electronic ICT devices through which information exchange can be done remotely. Information is delivered as electrical signals at high speed over the network, preferably the Internet, with computer devices as end devices, that is, as transmitters and receivers. Simply put, computer devices connected to the Internet are all that is needed to participate in an online course from anywhere at any time, as illustrated in Figure 1.2.
Figure 1.2 Illustration of the difference between a conventional course and an online course. While a conventional course is restricted by place and time, an online course can be taken anywhere and at any time (images from openclipart).
Online courses are now being highlighted by many parties, who see them as one solution to the uneven distribution of education. Straightforwardly, not everyone has access to good quality education, and there are also those who have no access at all; by using online courses, people can receive education without going to school. Knowing this, our peers tried to implement online courses in their universities, one in Indonesia and the other in Myanmar. Another peer already has online courses well established in Mongolia and is now moving to massive open online courses (MOOCs). Unlike a private online course, which is only for students of a university, a MOOC is open to anyone without discrimination. In the United States (US), MOOCs are also being used to scout for potential students. For example, the Massachusetts Institute of Technology (MIT) found a genius Mongolian high school student who perfectly aced its Circuits and Electronics MOOC, then admitted him as a freshman. In summary, many people see a bright future in utilizing online courses in education.
Despite all the benefits of online courses, there are still problems preventing many people from enjoying them. The main problem is the lack of accessibility to online courses due to insufficient ICT infrastructure. In other words, there are people who have network connectivity issues, especially in developing countries. In a random survey by Kusumo et al. of students in Indonesia, 60% of them agreed that Internet connection is still problematic. The e-readiness survey by Monmon et al. at Yangon Technological University and Mandalay Technological University in Myanmar showed lower Likert scale scores for the students' and teachers' perception of the ICT network compared to other items. Today the world Internet penetration is still around 50%, indicating that only half of the world's population can access online courses. Even though these people have access, their access quality may still be questionable, which can lead to dissatisfaction in accessing online courses.
The obvious solution to the accessibility issue is to improve the ICT infrastructure; however, this takes a long time. Therefore another method was implemented, namely a distributed system rather than a centralized system. The concept is to have people access the service in their local area, which is closer, rather than in the central area, which is farther away. Some references describe this as the third generation of content management system (CMS), though this work is more about the learning contents of a learning management system (LMS) than the general contents of a CMS.
With distributed LMS as the solution to the lack of accessibility of online courses, the next problem, which is the one discussed in this thesis, is synchronization, that is, keeping the learning contents up to date. This can also be described as the maintenance of the learning contents. Specifically, two problems are highlighted in this thesis, as follows:
- The lack of network connectivity for synchronization. Synchronization is usually set to be done online, where the servers synchronize with one another in order to keep the learning contents at their latest version. In that case, synchronization is not possible when there is no network connectivity.
- Duplicate data transfer during synchronization. By default, full synchronization is used, where the learning contents are usually bundled as courses. Commonly, when the contents of a course are revised on the LMS, the whole course is distributed to the other servers, including previously distributed contents (duplicate data). In this case, there is much redundant data, which adds more burden to the network.
This thesis provides two main solutions for the two problems:
- For the first problem of no network connectivity, the solution is to provide a portability function to the distributed LMS. Straightforwardly, this enables the servers to move to other locations where there is network connectivity to synchronize, and to return to their original locations after synchronizing.
- For the second problem of duplicate data, the solution is to utilize incremental synchronization through a continuous differential synchronization technique. The new contents are identified before synchronization and only the new contents are distributed, leaving out the redundant data.
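The identification step in the second solution can be sketched as follows. This is a minimal illustration in Python, not the implementation used in this thesis: the function names `signature`, `delta`, and `patch` and the tiny block size are assumptions chosen for the example, and fixed-size blocks are used for simplicity where rsync-style tools use a rolling checksum to find matches at arbitrary offsets.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size for illustration only; real tools use far larger blocks


def signature(old: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Map the checksum of each fixed-size block of the old content to its offset."""
    sig = {}
    for offset in range(0, len(old), block_size):
        block = old[offset:offset + block_size]
        sig.setdefault(hashlib.sha256(block).hexdigest(), offset)
    return sig


def delta(sig: dict, new: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Describe the new content as copies of known blocks plus literal new bytes."""
    ops = []
    for offset in range(0, len(new), block_size):
        block = new[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if digest in sig:
            # The receiver already has this block: send only a reference.
            ops.append(("copy", sig[digest], len(block)))
        else:
            # New content: the bytes themselves must be transferred.
            ops.append(("data", block))
    return ops


def patch(old: bytes, ops: list) -> bytes:
    """Rebuild the new content on the receiver from its old copy and the delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += old[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# Hypothetical course dumps: the revision appends one new block.
old_dump = b"AAAABBBBCCCC"
new_dump = b"AAAABBBBCCCCDDDD"

ops = delta(signature(old_dump), new_dump)
literal_bytes = sum(len(op[1]) for op in ops if op[0] == "data")
print(f"full synchronization would send {len(new_dump)} bytes")
print(f"incremental synchronization sends {literal_bytes} literal bytes")
```

Only the signature travels from the outdated server and only the delta travels back, so previously distributed blocks are never sent again.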
The significance is discussed in detail in further sections, but in general it can be stated as follows:
- Possibility of flexible synchronization in regions with severely limited network connectivity by mobilizing the servers of the distributed LMS. It can also be pictured as widening the network coverage.
- Lower network cost can be achieved from incremental synchronization.
The objective of this research is to enable online synchronization of distributed LMS in regions with almost no network connectivity and to reduce redundant data transfer during synchronization.
- Introduced a novel concept of integrating hand carry servers into distributed LMS, which makes it mobile or portable and therefore able to perform synchronization in regions with severely limited network connectivity. This thesis also demonstrated the portability of hand carry servers through a survey simulation and, on the other hand, showed their limitations through stress testing.
- Though the novelty of incremental synchronization in distributed LMS was already claimed, this thesis showed a different approach called dump and upload based synchronization. Its advantages are that a single software application is compatible with most LMSs and that it benefits from the features of each LMS, for example the LMS's privacy and security features, which automatically make the synchronization private and secure, and on Moodle the possibility of partial synchronization due to the breaking down (micronization) of course contents into blocks. Another advantage is that this approach supports bidirectional synchronization.
Each method may have limitations, which are discussed in detail in their respective sections; the general limitations of this research are mentioned here:
- The system has only been tested in the laboratory and has not yet been implemented in real running online courses. The experiments were done on the author's virtual machines, the laboratory's local area network (LAN), and free public clouds owned by the author.
- Only one hand carry server was used in the actual experiment; the discussed expansion to more than one is still a concept derived from the experiment.
- This thesis' dump and upload based incremental synchronization is novel in its concept but not in its software application, since it only makes use of existing software applications. These are the export and import features of the LMS to dump the learning contents, and the rdiff application, based on rsync, to identify the difference between dumps.
- The course experimented on is the author's self-created course, which was never delivered; in short, it is not an actual running course.
1.8 Structure of the thesis
Beyond this section the thesis contains three more chapters:
- Chapter 2 discusses portable distributed LMS. In order, it gives a brief introduction to distributed LMS, then presents the author's work showing the convenience of the hand carry server, the concept of the hand carry server in distributed LMS, and lastly the hand carry server's limitations.
- Chapter 3 discusses incremental data synchronization. In order, it covers the story of sharing learning contents, the distinction between full synchronization and differential and incremental synchronization, a discussion of the previous work on dynamic content synchronization versus the author's work on dump and upload based synchronization, and finally experiments and results showing the percentage of duplicate data eliminated by incremental synchronization.
- Chapter 4 is the conclusion of this thesis, which also discusses future work.
2 Portable Distributed LMS
2.1 Distributed Systems
2.1.1 Partitioned System
Distributed systems are a wide topic with different implementations. One implementation is a partitioned system. For example, an organization's network can have its servers separated, where the database, directory, domain name system (DNS), dynamic host configuration protocol (DHCP), file, web, and other servers each run on a separate machine. They are integrated but independent, so that if one service (server) is damaged, it will not damage the other services. A different example is data partitioning, where data are fragmented so that, when retrieved, they have to be gathered and merged. This usually happens in collaboration, where people work on the same project but from different machines.
2.1.2 Replicated System
Another implementation is a replicated system, and this is the one referred to and used in this thesis. The need for a replicated system can be due to bottlenecked traffic, geographically poor network connectivity, or both. One of the most popular implementations is search engines such as Google and Yahoo, which have different server locations assigned local domains, for example .co.jp for Japan and .co.id for Indonesia. Less well known than search engines are online multiplayer games. The servers of online multiplayer games can reside in many regions such as Asia, Europe, the United States, and China. There are games that show the population of each server, indicating whether it is full or not. Players can choose another server when a server has reached its population limit or when they cannot reach the server in their region.
2.2 Distributed Learning Management System
One definition of LMS is a system that manages learning and teaching, specifically for the online case. The current form of LMS is a software application. It does not just deliver learning materials to students but computerizes, online, any activity that can happen in a class. Some of these activities are interactions, whether through chat applications or forums as on social networking services (SNS); assignments, which are submitted electronically through the LMS by uploading files; and quizzes or examinations, which can be graded automatically or manually. In addition, the facts that it can be accessed from anywhere at any time and that it runs on computers, which perform tasks much faster and more automatically than humans, make unique applications, data mining, and learning analytics possible. In short, new features are being developed every day. Many LMSs exist today, as listed in Table 2.1, whether they are open source (free to use and modify, with all the code open), only available on clouds as software as a service (SaaS), which tends to be freeware or usage only, or proprietary, which tends to be business, commercial, or paid. In the author's surroundings, Moodle is mostly used.
Table 2.1 Examples of LMSs by type.
|Type|LMS|
|---|---|
|Open Source|aTutor, Canvas, Chamilo, Claroline, eFront, ILIAS, LAMS, LON-CAPA, Moodle, OLAT, OpenOLAT, Sakai, SWAD, Totara LMS, WeBWorK|
|SaaS/Cloud|Cornerstone OnDemand Inc, Docebo LMS, Google Classroom, Grovo, Halogen Software, Informetica, Inquisiq R3, Kannu, Latitude Learning, Litmos, Talent LMS, Paradiso LMS, TOPYX, TrainCaster LMS, WizIQ LinkStreet|
|Proprietary|Blackboard Learning System, CERTPOINT Systems Inc, Desire2Learn, eCollege, Edmodo, Engrade, WizIQ, GlobalScholar, Glow, HotChalk, Informetica, ITWorx CLG, JoomlaLMS, Kannu, Latitude Learning LLC, Uzity, SAP, Schoology, SSLearn, Spongelab, Skillsoft, EduNxt, SuccessFactors, SumTotal Systems, Taleo, Teachable, Vitalect|
The term distributed LMS means that the replicated servers contain an LMS. Each server is meant to serve online courses. The implementation can be a full replication, where not only the learning contents but everything else, including activities, assessments, and interactions, is synchronized. This means students and teachers can freely use any server, with the recommended one being the server with the best network connectivity. The other implementation is partial replication, where only non-private data are synchronized, usually only the learning contents. This can happen when there are jurisdictions in which each region is to be handled locally. In other words, contents are provided, but each school and university is still the owner of its own servers and asserts local authority. Either way, a distributed system is the solution for bottleneck and connectivity issues. As illustrated in Figure 2.1 for Indonesia, it is better to build and spread more servers than to have one centralized server in the capital city.
Figure 2.1 Illustration of the main benefit of a distributed system using the ICT penetration map of Indonesia in 2012, where greener regions have good network connectivity and redder regions the opposite. (a) People in redder regions will have difficulty accessing the central server. (b) On the other hand, people will have no difficulty accessing if there are servers in their local regions.
2.3 Hand Carry Server in Distributed LMS
After the establishment of a distributed LMS, the contents need to be maintained, or kept up to date, through synchronization. However, the problem is the lack of network connectivity between servers, usually found in remote areas such as schools in villages. It may be easy to build a LAN but difficult to build connections to other servers, or simply an Internet connection, in distant places. In the short term it is only possible to build a very limited connection (very low speed), over which retrieval of contents may seem to take forever if they are very large. The metaphor is building a server in a jungle, on a remote island, or in a desert, all of which are very isolated. The default solution is offline synchronization; the author's solution is server mobilization.
2.3.1 Portability of Hand Carry Server
Before discussing synchronization, this section introduces hand carry servers. In this thesis it is called a hand carry server because the physical hardware is a computer about the size of a human hand that has been configured as a server. Such a device is also called a mini, pocket size, or portable computer. The example used in this thesis is the Raspberry Pi 2, with the specification in Table 2.2.
|Table 2.2 Specification of the Raspberry Pi 2|
|---|
|900 MHz quad-core ARM Cortex-A7 CPU|
|1 gigabyte (GB) of random access memory (RAM)|
|4 Universal Serial Bus (USB) ports|
|40 general purpose input/output (GPIO) pins|
|Camera Serial Interface (CSI)|
|Display Serial Interface (DSI)|
|Micro Secure Digital (SD) card slot|
|VideoCore IV 3D graphics core|
|Size of 85.60 mm × 56.5 mm (3.370 in × 2.224 in), not including protruding connectors|
|Weight of 45 g|
The portability was demonstrated in one of the author's previous works. It is less related to distributed systems, but it showed an application of the hand carry server to manual labor: a simulation comparing a paper based survey method to a hand carry server survey method. The motivation was that Internet connection is lacking for performing online surveys, yet most people in developing countries own a computer device. Instead of reverting to the paper based method, the participants' personal digital assistants (PDAs) can be utilized by connecting them to the hand carry server to perform a semi-online survey, as illustrated in Figure 2.2.
Figure 2.2 Illustration of using a hand carry computer device to gather information from other users, input from their own computer devices.
For the simulation, a MOOC readiness survey consisting of 30 questionnaire items was simulated on 30 participants by a surveyor. The whole survey consists of three stages: preparation, responding, and post survey. In the preparation stage, for the paper based method the surveyor creates the questionnaire items in word processing software and then prints them, while for the hand carry server method the surveyor creates the questionnaire in a web based survey application called LimeSurvey CMS. In the responding stage, for the paper based method the surveyor hands out paper to each participant and collects it when they have finished responding, while for the hand carry server method the surveyor tells the participants to connect their PDAs to the hand carry server and informs them of the URL of the local survey site, then waits until the participants submit their results to the hand carry server. Though the results in Figure 2.3 showed no difference in time consumption for the preparation and responding stages, the paper based method tends to demand more labor, such as printing the questionnaires (the time taken multiplies greatly with old printers) and carrying heavy papers if there are a lot of participants. On the other hand, resources are the main issue for the hand carry server, which is discussed in the Limitation of Hand Carry Server section.
Figure 2.3 Time consumption of the survey process from preparation and responding to post survey. (a) For the paper based method, preparation consists of question typing and question printing, responding consists of question distribution, question answering, and response collection, and post survey consists of response insertion. (b) For the hand carry server method, preparation consists of question typing with web delays, and responding consists of server connection and question answering with web delay; the advantage of this method is that no post survey is needed, since the responses are already inserted automatically.
However, the advantage was shown in the post survey stage, where surveyors usually have to input the responses into the database and also handle human errors through verification such as double checking, which seems to be the most stressful and tiring process of the paper based method. This differs from the hand carry server method, where the responses are processed automatically, so there is literally no post survey stage. In fact, results and statistics are instantly visible, which no manual method can outpace. The participants can see the current statistics the moment they submit their responses, as exemplified in Figure 2.4.
Figure 2.4 Data in the form of bar graphs and pie charts were shown the instant the hand carry server received the responses. Only 4 of the 30 item results are shown here, since there are too many to show all.
The author's work mostly discussed the convenience of computerization, but the important part is the mobility or portability. Referring back to Figure 2.2, the hand carry server can be carried anywhere (a walking or moving server) and only needs a direct current (DC) power supply of 5 V (volts) and 2 A (amperes); usually a hand carry power bank is enough. In the simulation, the charge delivered was also measured as 0.6 Ah (ampere hours) in 39 minutes (the whole duration of the survey, see Figure 2.3), meaning that with the power bank's specification of 20000 mAh it will last roughly 20 hours. In short, the hand carry server has a low power cost and can last long while mobile.
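The runtime estimate can be checked with a short calculation. The sketch below assumes the power bank capacity is 20 Ah (20000 mAh) and ignores conversion losses, so it is an idealized estimate rather than a measurement; the function names are chosen for this example only.

```python
def average_current_a(charge_ah: float, duration_min: float) -> float:
    """Average current in amperes from the charge delivered over a duration."""
    return charge_ah / (duration_min / 60.0)


def runtime_hours(capacity_ah: float, charge_ah: float, duration_min: float) -> float:
    """Idealized runtime: capacity divided by the average current drawn."""
    return capacity_ah / average_current_a(charge_ah, duration_min)


# Measured in the survey simulation: 0.6 Ah delivered in 39 minutes.
current = average_current_a(0.6, 39)
# Assumed power bank capacity: 20 Ah (20000 mAh).
hours = runtime_hours(20.0, 0.6, 39)

print(f"average current: {current:.2f} A")  # 0.92 A
print(f"estimated runtime: {hours:.1f} h")  # 21.7 h
```

The average draw of about 0.92 A is well under the 2 A supply rating, and the idealized runtime of about 21.7 hours is consistent with the figure of roughly 20 hours once losses are taken into account.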
2.3.2 Synchronization in Severe Network Connection
Currently, synchronization has to be taken offline when there is no network connectivity, whether it is full or incremental (discussed in the next chapter). An administrator goes to a network connected location, or directly to the updated server, to retrieve the contents and store them on a storage medium such as a compact disc (CD) or flash drive. The administrator then travels back to the outdated server, inserts the storage medium, and loads the contents. There is a work by Ijtihadie et al. on differential update, where the differentials were sent through email and then used to update the contents differentially. It should be possible to put the differentials onto a storage medium, which is then inserted into the outdated server to update the contents.
Figure 2.5 Illustration of moving hand carry servers, which have to move to a location with network connectivity to synchronize with the main server and return to their original location after finishing.
Another way is to move the server to an area with connectivity, have it update, and then return it to its original location. This was actually inspired by Ijtihadie et al., where the students download the quiz on their mobile devices, answer it offline at home, and later find an Internet connection to synchronize (automatically upload their answers). This concept was applied in this thesis' work, where the process happens on the hand carry server instead of the mobile device. It is illustrated in Figure 2.5, currently with people carrying the servers. An example of implementation is in Figure 2.6. There are regions in Indonesia that do not have good network connectivity, making it difficult to synchronize with other servers. If those servers are replaced with hand carry servers, they can be physically moved to find network connectivity (they support wired and wireless connections) to synchronize, and in the end return to their original locations.
Figure 2.6 Implementation illustration of hand carry servers in distributed LMS in Indonesia. (a) Servers in redder areas have difficulty with network connectivity. (b) Replacing those servers with hand carry servers makes them physically mobile and able to search for network connectivity.
Within the distributed LMS, the servers can either be replaced with hand carry servers or be left mounted with hand carry servers added as support, meaning the hand carry servers travel from server to server. It can be a temporary implementation while no network infrastructure has been built, since it is fast and simple to install, or it can serve to cover network coverage holes, where the hand carry server moves around the areas without network coverage.
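On the software side, the move, synchronize, and return cycle amounts to waiting until the main server becomes reachable and then triggering synchronization once. The following Python sketch illustrates this under stated assumptions: `has_connectivity` and `sync_when_connected` are hypothetical helper names, the connectivity check simply tries to open a TCP connection, and the actual synchronization is passed in as a callable because it depends on the LMS.

```python
import socket
import time
from typing import Callable


def has_connectivity(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the given server can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def sync_when_connected(
    is_connected: Callable[[], bool],
    synchronize: Callable[[], None],
    max_checks: int,
    wait_seconds: float = 0.0,
) -> bool:
    """Poll for connectivity while the server is being carried; synchronize once."""
    for _ in range(max_checks):
        if is_connected():
            synchronize()
            return True  # the server can now be carried back
        time.sleep(wait_seconds)
    return False  # no connectivity found within the allowed number of checks


# Simulated trip: no connectivity on the first two checks, then connected.
checks = iter([False, False, True])
log = []
done = sync_when_connected(lambda: next(checks), lambda: log.append("synced"), max_checks=5)
print(done, log)
```

In a deployment, `is_connected` could wrap `has_connectivity` with the address of the main server, and `synchronize` could start the incremental synchronization discussed in the next chapter.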
2.4 Limitation of Hand Carry Server
Along with its compact size and light weight, the hand carry server has resource limitations. The resources responsible for servicing are mainly the central processing unit (CPU) and random access memory (RAM) (the detailed specification can be seen in Table 2.2). As shown in Figure 2.7, the CPU and RAM are already close to exhausted when 30 participants attempt the survey. These measurement results alone may not convey much, but they become meaningful when stress testing is conducted, as in the next subsection.
Figure 2.7 Resource usage during the survey attempted by 30 users, showing mostly over 80% CPU usage and around 700 MB of RAM usage.
2.4.2 Stress Testing
Experienced users may completely understand from the resource measurement results alone, but others will have to get hands-on and run a few trials to see how far this hand carry server is actually capable. For that reason, stress testing was proposed and conducted. Though it was tested for the survey purpose, the method is applicable to other applications. For the stress testing, a web stress testing software application called FunkLoad was used. Different numbers of virtual users, increasing from 10 up to 100, were generated and attempted the survey on the hand carry server simultaneously, as illustrated in Figure 2.8. This time only response time was measured.
Figure 2.8 Stress testing illustration using the FunkLoad software application, which generates up to 100 virtual users to stress the hand carry server (images from openclipart).
Response time can be referred to as service time, in this case how long users take to load questionnaire items and to submit responses. The service time can also be called queuing time, where some users take a shorter time and others take longer; Figure 2.9 shows the average response time and the maximum response time (that of the user last in the queue). It shows that the response time increases with the number of users and also increases when the questionnaire content size increases, because this affects the number of questionnaire items to be retrieved and the number of responses that have to be submitted. From these results, the surveyor can decide the target average response time and the tolerable maximum response time. The number of simultaneous users and questionnaire items can then be determined. The results also showed that the hand carry server reached its limit above 85 concurrent users with 30 questionnaire items, at which point the service stops working and must be restarted.
Figure 2.9 Stress testing results showing response time increasing with the number of virtual users and the number of questionnaire items: (a) average response time and (b) maximum response time.
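The measurement idea behind the stress test, many simultaneous virtual users with the average and maximum response time recorded, can be sketched with the Python standard library. This is not FunkLoad and not the test script used in this thesis: `stress_test` and `fake_request` are hypothetical names, and `fake_request` stands in for a survey page load by simulating a server that handles one request at a time, so that the queuing effect becomes visible.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, Tuple

_server_lock = threading.Lock()  # simulated server: one request served at a time


def fake_request(service_time: float = 0.005) -> None:
    """Hypothetical stand-in for loading a survey page from the hand carry server."""
    with _server_lock:
        time.sleep(service_time)


def timed_call(request: Callable[[], None]) -> float:
    """Response time of one virtual user, including time spent waiting in the queue."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def stress_test(request: Callable[[], None], n_users: int) -> Tuple[float, float]:
    """Run n_users requests concurrently; return (average, maximum) response time."""
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        durations = list(pool.map(lambda _: timed_call(request), range(n_users)))
    return mean(durations), max(durations)


for n_users in (10, 20, 30):
    avg, worst = stress_test(fake_request, n_users)
    print(f"{n_users} users: average {avg * 1000:.1f} ms, maximum {worst * 1000:.1f} ms")
```

Against a real server, `fake_request` would be replaced by an actual HTTP request to the survey URL, and the printed averages and maxima could then be compared with the target and tolerable response times to choose the number of simultaneous users.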