It’s time to write again

Long time no see.

Really, it’s been a long time since I last wrote on this blog. I’m still in the software engineering field, and I have gained a lot more exposure to performance and parallel computing, cloud and big data since I started working in bioinformatics.

I have also invested a lot of time in Hearthstone, a digital card game published by Blizzard. I went from casual player to competitive player, and then to National Champion of China. I tried streaming and commentating at tournaments, and I now organize tournaments and events with my local company XGamer. I will put the link here so you can go check it out. It’s in Chinese though.

Just a few months ago, I consulted for a sneaker app business in mainland China. The consultation was very fruitful and I found that the role suits me. I’d love to do more consulting in the future, and I think going back to my blog would be a good start.

So here I am. I will be writing something soon. I have just tried Unity, the game engine, so maybe I will write a little bit about that. That should be fun.

User Login Design

The Typical Software Design Story

Usually the first version of a web application has users and a user login. The login is simple: a user is verified by a username and a password. So we design the class like this.
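The original listing did not survive on this page, so here is a minimal sketch in Python of what such a first-version class might look like. The class name, method names and the choice of a salted PBKDF2 hash are my assumptions for illustration, not the original design.

```python
import hashlib
import hmac
import os


class User:
    """A user who logs in with a username and a password (hypothetical sketch)."""

    def __init__(self, username: str, password: str) -> None:
        self.username = username
        # Store a salted hash rather than the password itself.
        self._salt = os.urandom(16)
        self._password_hash = self._hash(password)

    def _hash(self, password: str) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), self._salt, 100_000)

    def check_password(self, password: str) -> bool:
        # compare_digest avoids leaking information through timing differences.
        return hmac.compare_digest(self._password_hash, self._hash(password))


alice = User("alice", "correct horse")
print(alice.check_password("correct horse"))  # True
print(alice.check_password("wrong guess"))    # False
```

Everything about the user, including how the credential is checked, lives in one class at this stage.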

This works pretty well for now. Project manager is happy. The developers are happy.

A month later, the team wants to simplify the registration process so that more users will sign up. Your project manager asks you to support login with Facebook and Google accounts.

You think about it. You think about adding a type attribute to each user. But normal users and Facebook users have different information to store. This could be an obstacle, so you move on and keep thinking. It is very tempting to extend the User class into NormalUser, FacebookUser and GoogleUser. They all have the same behaviours except for the credential checking. So you go ahead and implement it. Project manager is happy. All developers are happy.
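A rough Python sketch of that inheritance approach could look like this. The token check is a stand-in I made up; a real system would verify the token with Facebook or Google, and the passwords are kept in plain text only to keep the sketch short.

```python
class User:
    """Base class: shared behaviour lives here, credential checking does not."""

    def __init__(self, name: str) -> None:
        self.name = name

    def greeting(self) -> str:
        return f"Welcome back, {self.name}!"

    def check_credential(self, credential: str) -> bool:
        raise NotImplementedError


class NormalUser(User):
    def __init__(self, name: str, password: str) -> None:
        super().__init__(name)
        self._password = password  # plain text only to keep the sketch short

    def check_credential(self, credential: str) -> bool:
        return credential == self._password


class FacebookUser(User):
    def __init__(self, name: str, facebook_id: str) -> None:
        super().__init__(name)
        self.facebook_id = facebook_id

    def check_credential(self, credential: str) -> bool:
        # Placeholder: pretend a valid token has this hypothetical shape.
        return credential == f"fb-token-for-{self.facebook_id}"


class GoogleUser(User):
    def __init__(self, name: str, google_email: str) -> None:
        super().__init__(name)
        self.google_email = google_email

    def check_credential(self, credential: str) -> bool:
        # Placeholder: pretend a valid token has this hypothetical shape.
        return credential == f"google-token-for-{google_email_key(self.google_email)}"


def google_email_key(email: str) -> str:
    # Hypothetical helper: normalise the address before comparing.
    return email.strip().lower()


bob = FacebookUser("bob", "1234")
print(bob.greeting())                                # Welcome back, bob!
print(bob.check_credential("fb-token-for-1234"))     # True
```

Each login type stores its own data and overrides only the credential check, which is exactly why the design feels so natural at this point.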

One more month later, you find that the application needs different types of users, and you decide that inheritance is the best design choice for that. However, you have already used inheritance for the different login types, and combining the two would need a subclass for every pairing of user type and login type. So what can you do?

Inheritance is not as good as you thought

Back in your college years, your professor taught you about object-oriented programming, giving you examples of inheritance to achieve polymorphism, code reuse and so on. He didn’t tell you about the limitations. Don’t get me wrong. It might not be your professor’s fault, since your OO class only lasted a semester and there wasn’t enough time to cover those parts.

Composition is the way

So how can we re-design the system here? Turns out it’s not that hard. We add another class for the purpose of authentication.
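Here is one way the composition idea might be sketched in Python. The names Authenticator, PasswordAuth, FacebookAuth and AdminUser are hypothetical, and the token check is again a made-up placeholder rather than a real provider call.

```python
from dataclasses import dataclass
from typing import Protocol


class Authenticator(Protocol):
    """Anything that can verify a credential."""

    def verify(self, credential: str) -> bool: ...


@dataclass
class PasswordAuth:
    password: str  # plain text only to keep the sketch short

    def verify(self, credential: str) -> bool:
        return credential == self.password


@dataclass
class FacebookAuth:
    facebook_id: str

    def verify(self, credential: str) -> bool:
        # Placeholder: pretend a valid token has this hypothetical shape.
        return credential == f"fb-token-for-{self.facebook_id}"


class User:
    """A user *has* an authenticator instead of *being* a login type."""

    def __init__(self, name: str, auth: Authenticator) -> None:
        self.name = name
        self.auth = auth

    def login(self, credential: str) -> bool:
        return self.auth.verify(credential)


class AdminUser(User):
    """Inheritance is now free to model user types."""

    def can_delete_posts(self) -> bool:
        return True


admin = AdminUser("bob", FacebookAuth("1234"))
print(admin.login("fb-token-for-1234"))  # True

# Switching login method does not change the user's type.
admin.auth = PasswordAuth("s3cret")
print(admin.login("s3cret"))             # True
```

With this split, adding a new login method means adding one authenticator class, and adding a new user type means adding one User subclass. Neither change touches the other.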

Common problem in web applications

Since most web applications have to store user information and authenticate users some time later, this is quite a common problem in the field. Next time you build a User class, it could be a good idea to adopt the above design even before you need more than one login method.



Using GitHub for Mac/Windows as a GUI client

I always have an uneasy moment viewing Git diffs and logs in the terminal. It’s not easy to navigate between check-ins since you have to copy and paste the check-in hashes. Recently I turned to the almighty Google god in search of a solution.

GitHub for Mac/Windows

GitHub for Mac/Windows is a GUI Git management tool developed by GitHub, obviously. It lets you examine not only your GitHub repositories but also your local repositories.

GitHub for Mac


The UI is neat. Almost anyone can pick it up immediately. I suggest you try it out if you use Git for your projects.


GitHub for Mac
GitHub for Windows


Head First Design Patterns


Head First Design Patterns is a lightweight introduction to object-oriented design patterns. It approaches each design pattern from different perspectives and compares the patterns to each other.

However, just like other books in the Head First series, this book contains tedious exercises and not-so-funny cartoons which I would normally skip. Those parts make up 40% of the book, which is quite annoying.


Nevertheless, if these don’t bother you, it will still serve you well as a dive into the world of design patterns.

Edit: Some readers misunderstood me. I don’t really hate the book, but it would be much better if the editor could improve the exercises and the way they tell the story. Other than that, I still think HFDP is a great book that can guide you into design patterns step by step.

PHP: Unexpected behaviour in nested array

It’s a trap!

Let me first show you some code.

Guess what it will print? [["Hi"]]?


Wrong. It remains [[]].

Where it began

This is a modified piece of code from my project. It surprised me that $aVeryLongNameWhichHoldANestedArray doesn’t change. Isn’t $shortHand supposed to be a reference to the element instead of a copy?

Let me show you the same piece of code written in JavaScript and Ruby.

In JavaScript and Ruby, aVeryLongNameWhichHoldANestedArray is [["Hi"]], which to me is a lot more reasonable.
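To make the difference concrete, here is a small sketch in Python, which is not one of the languages in the original listings but shares the reference behaviour described above. The variable names are mine. The second half uses an explicit copy to imitate what PHP does implicitly: assigning a PHP array to another variable gives you a copy by value unless you ask for a reference with `&`.

```python
import copy

# Reference semantics: the short name refers to the same inner list.
nested = [[]]
short_hand = nested[0]
short_hand.append("Hi")
print(nested)       # [['Hi']]

# Imitating PHP's assign-by-value for arrays with an explicit copy.
php_style = [[]]
short_hand_copy = copy.copy(php_style[0])  # an independent list
short_hand_copy.append("Hi")
print(php_style)    # [[]]
```

In the first half the short name and the nested element are the same object, so the change is visible through both. In the second half the append only touches the copy, which mirrors the PHP result.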

Let’s take a look at something similar

So now, what will it print? It may surprise you again. It prints [stdClass Object([hi] => Hi)].

Pitfalls of languages

No programming language is perfect when it is applied to software engineering. Even C has some pitfalls after being carefully standardised by ANSI. The fallthrough switch is one of them: it was considered a feature during standardisation, but it isn’t used most of the time, and programmers just automatically insert a break after each case. Unluckily, PHP inherits that too.

Private Constructor

A few years ago

I recalled a story today. When I was studying at university, I asked the professor who taught Java:

What is the use of private constructor?

He didn’t have the answer. We thought of a few possibilities but weren’t so sure about them at the end of the day.

The Private Constructor

We cannot instantiate MyClass from outside the class. new MyClass() can only be called inside MyClass. Is it useless?

The Singleton Pattern

It turns out the private constructor is useful for the singleton pattern: it stops other classes from instantiating the singleton on their own.

With this, we can ensure that the MySingleton class is instantiated at most once during the run time of the application.
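The same idea can be sketched in Python. Python has no private constructors, so this sketch imitates one with a private token that only the class itself passes in. The token trick and the method name instance() are my own choices for illustration, and the sketch is not thread-safe.

```python
class MySingleton:
    """Singleton sketch: direct construction is refused, use instance()."""

    _instance = None
    _token = object()  # private marker that only this class hands out

    def __init__(self, token=None):
        # Stand-in for a private constructor: reject outside callers.
        if token is not MySingleton._token:
            raise RuntimeError("use MySingleton.instance() instead")

    @classmethod
    def instance(cls):
        # Create the single object lazily on first use.
        if cls._instance is None:
            cls._instance = cls(cls._token)
        return cls._instance


first = MySingleton.instance()
second = MySingleton.instance()
print(first is second)  # True

try:
    MySingleton()
except RuntimeError as error:
    print(error)        # use MySingleton.instance() instead
```

Every caller goes through instance(), so the class decides when the one object is created and nobody else can make a second one by accident.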


In bridge, a holding of only 1 card in a suit is also called a singleton.


No, I am not anti-PHP, not in that way.


I often talk casually about how bad PHP is, and people often mistake me for a PHP hater. In fact, I don’t hate PHP and even use PHP in some of my projects. I just feel that PHP is a sub-optimal choice in most situations.

PHP has a long history and is more popular for web pages and applications than any other language. In recent years, it has frequently been compared to Ruby and Python, with critics saying that PHP is inferior in its language design. While most of the criticisms are valid, I still think there are situations and reasons to use PHP over other languages.

2 reasons for me to use PHP

1) The skill set of your team

If the developers in your team have little experience in other languages, forcing them to code in a new language can be very risky for your project. Even if you have great developers, it could take half a year for them to feel comfortable in a new language. Less skillful developers who are familiar with PHP would still be limited by the PHP mindset if they write Java. Eventually they will be coding PHP in Java.

If you are in a startup company working on a prototype, avoid going for a new technology. Otherwise you will end up wasting time familiarizing yourself with INSERT_YOUR_COOL_LANGUAGE_HERE.

2) Simple web pages

The way PHP was originally designed favors plain HTML in .php files. PHP can be a great choice if the majority of your content is static. For example, if you want to work on a few web pages which share the same header, you can simply do this in PHP.






Achieving the same simplicity in other languages may require an additional library or a lightweight framework. Doing this in a heavy full-stack framework such as Rails or Django? No thanks.

Also, setting up a PHP server is relatively simple and straightforward. There are also packages that provide a ready-to-use LAMP (Linux, Apache, MySQL, PHP) platform for production.

Invalid arguments for choosing PHP

1) Facebook uses PHP too!

Having some giant companies that use PHP doesn’t mean it is suitable for you. Facebook, WordPress and Wikipedia could have invested a lot to overcome the drawbacks of using PHP. Facebook puts a lot of effort into optimizing PHP to handle 1 billion active users. They even tried compiling PHP into C++ to take advantage of gcc’s compiler optimisations. You may also want to dive into the code of WordPress as it’s available to everyone. It can take you quite a while to understand it.

2) It doesn’t matter if you are a good developer

There are some fundamental properties of a programming language that you can’t change no matter how good you are. For example, you can’t change the inequality operator <> in Pascal even if you think it is better to use != as in C. You cannot change C into a dynamically typed language when your application needs it.


The majority of programmers like to describe programming languages as tools. But in my opinion, they are more like materials, especially in software engineering. Once you decide to use bricks to build a house, your designs are limited by the bricks. You have to consider the weight, the durability, and the cost of the bricks when drawing the blueprints. Later on, if you want to rebuild the roof with some other material, you still have to consider the properties of the bricks: whether the brick walls can withstand the weight of the roof that you build with the new material.

Similarly, if you choose PHP for building your application, you will live within it. You have to be aware of its relatively slow performance compared to compiled languages if you are building a high-usage real-time application. Also, PHP may not be the right choice if you are writing a multi-threaded application, since it is not designed for that. Later on, if you want to optimize your application by rewriting the ORM in C to throw in tons of low-level optimization, you still have to consider the interface between PHP and C.


Although I consider PHP a sub-optimal choice for most cases, I still see PHP as suitable for some cases, especially small web sites. However, don’t take successful companies that use PHP as proof that PHP is better. Remember that your building material always matters. Maybe you agree with none of my statements here. That’s fine. But the bottom line is: be conscious of your decisions, and be able to convince yourself with valid arguments why you made that choice.

What if Ruby adopted Python-style indentation?

Python’s most famous feature

One of Python’s famous features is its semantic indentation. Here’s a code snippet from the official Python documentation:

There are no “end”s, no curly brackets. Python uses indentation to group statements, which is one of the reasons I prefer Python over Ruby. Here’s the identical code written in Ruby:

If Ruby adopted Python-style indentation, the Ruby version could save 4 lines and would become something like this:

Curly Brackets are less noticeable

Indentation is used to improve the readability of source code. Curly brackets, however, still exist for the compiler. Programmers don’t really pay attention to them when reading code. Have you ever tried hunting down a missing closing bracket? Your code looks completely fine when you scan through the structure, but that missing bracket breaks the compilation.

Repeat yourself with that indentation + curly brackets

By using indentation and curly brackets together, we are violating the DRY rule – Don’t repeat yourself. In order to follow the rule, you need to pick either indentation or curly brackets. I would pick indentation over curly brackets because of the visual benefit.

We interpret the code by indentation

Consider the famous “if-else” pitfall in C code, which seems fine at first sight. It turns out that if Country A is friendly and the president has some time, he will make a phone call to Country A. But if Country A is friendly and the president doesn’t have time, he will bomb Country A.

It turns out that organising blocks by indentation can solve this problem. Since organising blocks by indentation is more intuitive, programmers will make fewer mistakes even if they are somewhat careless or less familiar with the language syntax.
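The original C listing is not shown here, so the following Python sketch uses function and argument names I made up. It spells out the two readings of the pitfall: what the indentation suggests, and what a C compiler actually does when it attaches the else to the nearest if. In Python the two readings have to be written differently, so the code always does what it looks like it does.

```python
def as_the_indentation_suggests(friendly: bool, president_has_time: bool) -> str:
    # The else lines up with the outer if, so it belongs to the outer if.
    if friendly:
        if president_has_time:
            return "phone call"
    else:
        return "bomb"
    return "nothing"


def as_c_actually_parses_it(friendly: bool, president_has_time: bool) -> str:
    # The else lines up with the inner if, so it belongs to the inner if.
    if friendly:
        if president_has_time:
            return "phone call"
        else:
            return "bomb"
    return "nothing"


# A friendly country and a busy president:
print(as_the_indentation_suggests(True, False))  # nothing
print(as_c_actually_parses_it(True, False))      # bomb
```

The only difference between the two functions is how far the else is indented, which is the whole point: the visual structure and the meaning cannot drift apart.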


Why write more if you can write less? Some programmers argue that this limits the way they can organise their code, but I would say the benefit outweighs the cost. After all, all programming languages limit the coder in some way. Another language which follows Python’s indentation syntax is F#, which is developed by Microsoft. I predict more programming languages will adopt this syntax in the future.

Annoying SSH brute force attack from zombies

The problem

If you have ever checked on your SSH access log, you will find a lot of login attempts like this:

On this server, I want to check how frequent the attempts are so I type in the shell:

I haven’t excluded my own connections here because there were only a few. It shows that my server had 1545 SSH disconnects on the 3rd of May, so I received about one SSH login attempt per minute on average.

There are already a lot of tips out there for securing your SSH server, so I am not going to repeat them here. Theoretically speaking, the attacker has almost no chance of accessing your system if your password is long enough. For a random 10-character alphanumeric password, there’s only a 1% chance of breaking in after 229 million years if the attacker tries 10000 times per day. It is also a good idea to enforce RSA keys on a multi-user system.
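The estimate can be checked with a few lines of Python. This is a back-of-the-envelope sketch under my own assumptions: the password is drawn uniformly from 62 symbols (a-z, A-Z, 0-9), and the attacker never repeats a guess.

```python
alphabet_size = 26 + 26 + 10   # assumption: a-z, A-Z, 0-9
password_length = 10
keyspace = alphabet_size ** password_length

attempts_per_day = 10_000
target_probability = 0.01      # a 1% chance of having hit the password

# Trying a fraction p of all passwords gives a chance p of success.
attempts_needed = keyspace * target_probability
years = attempts_needed / attempts_per_day / 365

print(f"{keyspace:.3e} possible passwords")
print(f"{years:.3e} years to reach a 1% chance")
```

Under these assumptions the result comes out at roughly 2.3 billion years, so the figure quoted above is on the conservative side. It matches an attacker who makes 100,000 attempts per day rather than 10,000. Either way the conclusion is the same.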

Still, it is annoying.

Although it’s practically impossible for attackers to break into a secured server, I’m annoyed. Most of these attacks come from zombie networks, with the real hacker hiding behind them, so you can’t really do anything about it. Each SSH attempt costs almost nothing, so they will do it 24/7.


Consider increasing the cost of a failed attempt?

I was thinking of a way to increase the cost of an SSH attempt after a failed attempt, controlled by a new SSH protocol. The server could generate a factorisation problem for the client, and then double the difficulty of the problem after each failed attempt. Would this kind of protocol drastically decrease the throughput of a brute force attack? Feel free to put your 2 cents in.
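To get a feel for the idea, here is a toy sketch in Python. Note that it swaps the factorisation problem for a hash puzzle (find a nonce whose SHA-256 digest starts with a given number of zero bits), because that makes "double the difficulty" easy to express: each extra zero bit doubles the expected work. All function names and the base difficulty are assumptions of mine, and nothing here is part of the real SSH protocol.

```python
import hashlib
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def required_bits(failed_attempts: int, base_bits: int = 4) -> int:
    # One more bit per failure doubles the expected work for the client.
    return base_bits + failed_attempts


def solve(challenge: bytes, bits: int) -> int:
    """Client side: brute-force a nonce that satisfies the puzzle."""
    nonce = 0
    while True:
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce
        nonce += 1


def verify(challenge: bytes, bits: int, nonce: int) -> bool:
    """Server side: checking a solution costs a single hash."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= bits


challenge = os.urandom(16)  # fresh random challenge from the server
for failures in range(4):
    bits = required_bits(failures)
    nonce = solve(challenge, bits)
    print(f"{failures} failures: {bits} zero bits, accepted={verify(challenge, bits, nonce)}")
```

The asymmetry is what makes the scheme interesting: the server spends one hash to verify, while the client’s expected cost grows as 2 to the power of the number of failures. An honest user who mistypes a password once barely notices, while a brute force client would be throttled very quickly.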

If your code doesn’t solve a problem, it creates more.

I have been working on startup projects for about 2 years now. Startup software engineering is not easy because the requirements change a lot, and they change quickly. The code that is written today might become useless next week or, even worse, might never have been used at all.

Throughout the development process, we are constantly implementing new features for the new requirements. This process is often limited by the existing designs, including the database design and system architecture. You either need to dive in and make a lot of changes, or commit to the current design and work on a hackish patch to provide that feature. This makes me think, “It would be much better if I hadn’t written that code in the first place.” If your code doesn’t solve a problem, it creates more. The requirements will eventually change and smack you in the face.


Ancient soldiers didn’t carry the most powerful weapons or armour to war, because they were heavy. Most preferred leather armour or chain mail over plate mail, since it was much lighter and also cheaper. In fact, soldiers spent over 90% of their time travelling during wars. More powerful gear could have exhausted them before the fight. Similarly in software engineering, you need to maintain your code base all the time. You want to be lightweight and swift so that you are ready for requirement changes.

The simple modular design of Unix is probably why it is still widely used nowadays (of course, being free and open source is also a major factor).
