Author Topic: Autopatcher going very slow with multiples clients  (Read 18057 times)

SuperBidi

  • Not-a-newbie
  • *
  • Posts: 47
  • Karma: 0
    • View Profile
Autopatcher going very slow with multiples clients
« on: December 19, 2011, 04:20:04 AM »
I have a strange performance problem with the autopatcher, which may well come from misuse on my side.
First, the configuration: Autopatcher with a PostgreSQL database (basic configuration), used over TCP, running on an Ubuntu server. Patches are of 2 kinds: small (1MB) and big (450MB).
When launching one AutopatcherClient there is no problem, but if I launch 4 or 5 of them, suddenly everyone gets stuck. Looking at the Autopatcher logs, I see "ID_NEW_INCOMING_CONNECTION", and then it takes ages to get the log "IP|25712 GetChangelist complete. Delete files. 0 queued. 4 working". All the clients that are supposedly downloading are stuck in their download.
I know you don't advise using the Autopatcher for big patches, but is that the only cause of this problem?

Apart from that, I have problems with ports. When I stop the Autopatcher, it doesn't close its ports, so I need to wait a minute or 2 before launching it again.

Edit: I forgot to say that I added a RakSleep(10) in ThreadPool::WorkerThread, in case it has an impact.
« Last Edit: December 19, 2011, 06:25:43 AM by SuperBidi »

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #1 on: December 19, 2011, 06:38:07 AM »
I get quite a few "GetPatch complete. Aborted from download thread. 0 queued. 3 working." messages.
After that there are lots of logs, then nothing once again.
In case it helps.

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #2 on: December 20, 2011, 12:24:51 PM »
I also get this: GetChangelist added. 1 queued. -1 working.
That does not look like a normal log.

Rak'kar

  • Administrator
  • Hero Member
  • *****
  • Posts: 6895
  • Karma: 291
    • View Profile
    • RakNet
Re: Autopatcher going very slow with multiples clients
« Reply #3 on: December 20, 2011, 11:10:09 PM »
What version is this? It was a while ago but I added a lot of performance improvements.

From a first read it looks like your problem is in accessing the database to get the list of changes. Are you doing a full scan from the clients, or getting files since a certain date? You might also want to profile AutopatcherPostgreRepository::GetFilePart using RakNet::GetTime() to see how many milliseconds the calls are taking.
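
As a minimal sketch of the kind of timing suggested above: the helper below measures one call in milliseconds and keeps a running average and worst case. Both `TimeCallMs` and `CallStats` are hypothetical names, not part of RakNet, and `std::chrono::steady_clock` is used as a stand-in for RakNet::GetTime() so the sketch compiles on its own.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of RakNet. Runs `call` once and returns the
// elapsed wall-clock time in milliseconds. std::chrono::steady_clock stands in
// for RakNet::GetTime() so the sketch compiles without RakNet.
template <typename Fn>
long long TimeCallMs(Fn&& call) {
    const auto start = std::chrono::steady_clock::now();
    call();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

// Hypothetical accumulator for the measured durations. It is not thread-safe:
// use one instance per worker thread, or guard it with a mutex.
struct CallStats {
    long long count = 0;
    long long totalMs = 0;
    long long worstMs = 0;

    void Add(long long elapsedMs) {
        assert(elapsedMs >= 0);
        ++count;
        totalMs += elapsedMs;
        if (elapsedMs > worstMs) worstMs = elapsedMs;
    }

    double AverageMs() const {
        return count == 0 ? 0.0
                          : static_cast<double>(totalMs) / static_cast<double>(count);
    }
};
```

The idea would be to wrap the database call inside GetFilePart in `TimeCallMs` and feed the result to `CallStats::Add`, so that rare multi-second outliers show up next to the typical duration.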

About how many files did you upload to the server, and what are the average and largest sizes?

The autopatcher doesn't close the TCP connection for you. You have to do that yourself from the client TCPInterface.

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #4 on: December 21, 2011, 05:10:28 AM »
It's the 4.012 version.

I do a full scan from clients (as we are in fact testing whether the Autopatcher can handle a full installation). The big update contains 3000 files, the biggest should be around 15M. The small one contains 7 files, the biggest around 6M.
When it is stuck, the AutopatcherClient is at 100% of a CPU. AutopatcherServer does not consume any abnormal resources.
The small update gets stuck as often as the big one. I even got a small update stuck when there was no one else on the server.

I've benchmarked AutopatcherPostgreRepository::GetFilePart around the lock. It takes around 20-30ms. Very rarely, it goes up to nearly 2 seconds. Overall, it does not feel slow to me. The database is installed on the same machine, to avoid big network transmissions.

About TCPInterface, I'm of course calling packetizedTCP.Stop() before leaving the application.

Rak'kar

Re: Autopatcher going very slow with multiples clients
« Reply #5 on: December 21, 2011, 10:55:31 AM »
Please verify that AutopatcherServer::Update() calls OnGetChangelistSinceDateInt() for all clients right away. If so, confirm that GetChangelistSinceDateCB is also called 4 times in quick succession, which it should be, since each call should run in a separate thread. If that is also true, then it might be stuck in AutopatcherPostgreRepository::GetChangelistSinceDate. You could halt all threads and look at the call stack to see where it is stopped.

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #6 on: December 23, 2011, 10:15:36 AM »
I've logged OnGetChangelistSinceDate, GetChangelistSinceDateCB and AutopatcherPostgreRepository::GetChangelistSinceDate.
The "Time" values you will see in the logs are the benchmark of AutopatcherPostgreRepository::GetFilePart.

I got this one by killing an AutopatcherClient (where it says aborted):

System ready for connections
(C)reate database.
(A)dd application
(U)pdate revision.
(R)emove application
(Q)uit
ID_NEW_INCOMING_CONNECTION
OnGetChangelistSinceDate
195.101.197.218|14304 GetChangelist processing. 0 queued. 1 working.
GetChangelistSinceDateCB 6
195.101.197.218|14304 GetChangelist complete. Delete files. 0 queued. 0 working.
195.101.197.218|14304 GetPatch processing. 0 queued. 1 working.
Time 853
195.101.197.218|14304 GetPatch complete. Aborted from download thread. 0 queued. 0 working. //AutopatcherClient killed
ID_CONNECTION_LOST
Time 252
Time 217
195.101.197.218|14304 GetPatch complete. Files pushed for patching. 0 queued. -1 working.
ID_NEW_INCOMING_CONNECTION
195.101.197.218|14573 GetChangelist added. 1 queued. -1 working.
ID_NEW_INCOMING_CONNECTION
195.101.197.218|14850 GetChangelist added. 2 queued. -1 working.

After that, it was impossible for any later client to download anything.

Another example: I launch 6 clients at (nearly) the same time, and I get these logs:
System ready for connections
(C)reate database.
(A)dd application
(U)pdate revision.
(R)emove application
(Q)uit
ID_NEW_INCOMING_CONNECTION
OnGetChangelistSinceDate
195.101.197.218|36164 GetChangelist processing. 0 queued. 1 working.
GetChangelistSinceDateCB 1
195.101.197.218|36164 GetChangelist complete. Delete files. 0 queued. 0 working.
ID_NEW_INCOMING_CONNECTION
195.101.197.218|36164 GetPatch processing. 0 queued. 1 working.
Time 220
OnGetChangelistSinceDate
195.101.197.218|36173 GetChangelist processing. 0 queued. 2 working.
GetChangelistSinceDateCB 2
195.101.197.218|36173 GetChangelist complete. Delete files. 0 queued. 1 working.
Time 1975
ID_NEW_INCOMING_CONNECTION
OnGetChangelistSinceDate
195.101.197.218|36181 GetChangelist processing. 0 queued. 2 working.
OnGetChangelistSinceDate
195.101.197.218|36195 GetChangelist processing. 0 queued. 3 working.
GetChangelistSinceDateCB 3
GetChangelistSinceDateCB 4
195.101.197.218|36195 GetChangelist complete. Delete files. 0 queued. 2 working.
195.101.197.218|36181 GetChangelist complete. Delete files. 0 queued. 1 working.
ID_NEW_INCOMING_CONNECTION

And here, I wait something like 1 minute without the last 3 connections being opened. Suddenly, it unsticks itself and the connections arrive.
During that time, CPU is at 1% and memory is normal; it looks like it's just waiting.
I find it weird that I get "GetChangelistSinceDateCB 4" before the fourth ID_NEW_INCOMING_CONNECTION.
Note that I created 10 worker threads to handle these 6 simultaneous clients.

I never reach AutopatcherPostgreRepository::GetChangelistSinceDate, as I'm doing the first download.
If you have any kind of insight, it would be useful to me before I go into a deeper debugging session.
« Last Edit: December 23, 2011, 10:23:32 AM by SuperBidi »

Rak'kar

Re: Autopatcher going very slow with multiples clients
« Reply #7 on: December 24, 2011, 10:45:41 PM »
OK, the problem, or at least part of it, was that AutopatcherServer::RemoveFromThreadPool did not account for threads that are currently processing, only for threads that have not yet processed. Let me know if the attached file at least fixes the first part of your problem, where it stops patching.
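
To illustrate the kind of accounting involved (this is not the attached fix and not RakNet's actual code), here is a minimal sketch of a job tracker in which removing a client handles both jobs still waiting in the queue and a job a worker is already processing. `JobTracker` and all of its members are hypothetical names, and the sketch assumes at most one job per client at a time. Because the counts are derived from the containers, "working" can never drop to -1 as in the logs above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <set>

// Hypothetical sketch, not RakNet code. Assumes one job per client at a time.
class JobTracker {
public:
    // Adds a job for this client to the waiting queue.
    void Enqueue(int clientId) {
        std::lock_guard<std::mutex> lock(mutex_);
        queued_.push_back(clientId);
    }

    // Called by a worker: moves the oldest waiting job to "working".
    // Returns false when nothing is waiting.
    bool BeginWork(int& clientId) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queued_.empty()) return false;
        clientId = queued_.front();
        queued_.pop_front();
        working_.insert(clientId);
        return true;
    }

    // Called by a worker when it is done. Returns true if the result should
    // still be delivered, false if the client was removed in the meantime.
    bool EndWork(int clientId) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t erased = working_.erase(clientId);
        assert(erased == 1);  // EndWork must match an earlier BeginWork
        (void)erased;
        return cancelled_.erase(clientId) == 0;
    }

    // Called when a client disconnects: drops its waiting jobs and marks a
    // job in progress as cancelled instead of touching the working count.
    void RemoveClient(int clientId) {
        std::lock_guard<std::mutex> lock(mutex_);
        queued_.erase(std::remove(queued_.begin(), queued_.end(), clientId),
                      queued_.end());
        if (working_.count(clientId) != 0) cancelled_.insert(clientId);
    }

    std::size_t Queued() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return queued_.size();
    }

    std::size_t Working() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return working_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<int> queued_;
    std::set<int> working_;
    std::set<int> cancelled_;
};
```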

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #8 on: January 02, 2012, 06:29:06 AM »
I can no longer manage to block it completely, but I still have speed issues.
When a few people are updating (even one is enough), new connections are really slow to be handled. The problem is that an Autopatcher is something you launch at application startup, and most of the time it just checks that you've got the latest version and doesn't update anything. These clients must go through the update process very quickly.
Is there a way to give higher priority to connection/version checking than to updating?
Or, alternatively, to handle more players at a time? I've increased the number of worker threads, but it does not seem to have changed anything.
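
For illustration only, the prioritization asked about here could look like the sketch below: a job queue that always hands out version checks before patch downloads, and keeps first-in, first-out order within each kind. All names (`JobKind`, `Job`, `PrioritizedJobs`) are hypothetical and nothing here is an existing RakNet option. Such a queue only reorders jobs that are still waiting; it would not help if every worker is already blocked on a long transfer.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical job kinds; lower value means higher priority.
enum class JobKind { VersionCheck = 0, PatchDownload = 1 };

struct Job {
    JobKind kind;
    unsigned long sequence;  // arrival order, used to keep FIFO within a kind
    std::string client;
};

// Comparator for std::priority_queue: returns true when `a` should be served
// AFTER `b`, so the top of the queue is the job to serve first.
struct ServeLater {
    bool operator()(const Job& a, const Job& b) const {
        if (a.kind != b.kind) return a.kind > b.kind;
        return a.sequence > b.sequence;
    }
};

class PrioritizedJobs {
public:
    void Push(JobKind kind, const std::string& client) {
        jobs_.push(Job{kind, nextSequence_++, client});
    }

    // Pops the next job to serve. Returns false when the queue is empty.
    bool Pop(Job& out) {
        if (jobs_.empty()) return false;
        out = jobs_.top();
        jobs_.pop();
        return true;
    }

private:
    std::priority_queue<Job, std::vector<Job>, ServeLater> jobs_;
    unsigned long nextSequence_ = 0;
};
```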
« Last Edit: January 02, 2012, 09:12:25 AM by SuperBidi »

Rak'kar

Re: Autopatcher going very slow with multiples clients
« Reply #9 on: January 02, 2012, 10:38:20 AM »
You should use the last update date parameter when possible, and not do a full scan unless the user explicitly requests it. You can even store the last update date with the client installer, using the value from the server at the time you create the installer.
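
A minimal sketch of how a client could persist that date between runs, assuming it is kept as a number in a small text file shipped with the installer. The function names are hypothetical, and treating a value of 0 as "no date known, do a full scan" is an assumption of this sketch; check how your RakNet version interprets the last update date parameter before relying on it.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: serializes the server date so it round-trips exactly.
std::string FormatLastUpdateDate(double date) {
    std::ostringstream out;
    out << std::setprecision(17) << date;
    return out.str();
}

// Hypothetical helper: parses the stored text. Returns 0.0 when the text is
// empty, unreadable, or not a positive number.
double ParseLastUpdateDate(const std::string& text) {
    std::istringstream in(text);
    double value = 0.0;
    if (!(in >> value)) return 0.0;
    return value > 0.0 ? value : 0.0;
}

// Assumption of this sketch: 0 means no stored date, so a full scan is needed.
bool NeedsFullScan(double lastUpdateDate) {
    return lastUpdateDate <= 0.0;
}
```

The installer would be built with the text produced by `FormatLastUpdateDate` from the server's date at packaging time, and the client would rewrite it after each successful patch.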

You should still be able to do multiple full scans at the same time, though. To resolve that, you'll need to run multiple full scans and stop the threads to see which function in AutopatcherPostgreRepository they are spending so long in. My guess is AutopatcherPostgreRepository::GetFilePart, which would mean the database is slow to respond. You could solve that with a faster hard drive on the server.

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #10 on: January 02, 2012, 12:21:44 PM »
I think I need to explain in more detail what I do.
My test:

First, I launch an AutopatcherClient that performs a full scan.

ID_NEW_INCOMING_CONNECTION
195.101.197.218|30155 GetChangelist processing. 0 queued. 1 working.
195.101.197.218|30155 GetChangelist complete. Delete files. 0 queued. 0 working.
195.101.197.218|30155 GetPatch processing. 0 queued. 1 working.

At that stage, I launch a second AutopatcherClient that already has everything downloaded and sends the date of its previous update.
Meanwhile, one file gets downloaded on the first client (5 seconds).

ID_NEW_INCOMING_CONNECTION
195.101.197.218|30170 GetChangelist processing. 0 queued. 2 working.
195.101.197.218|30170 GetChangelist complete. No files in changelist. 0 queued. 1 working.

5 files get downloaded on the first client (around 25 seconds).

ID_CONNECTION_LOST

Only then does the second client finish its update check, which takes around 1 second when I launch it alone.
I doubt it is a database problem, because most of the time spent (these 30 seconds) is due to my network connectivity. The total file size is around 10Mb, which should take only around 1 second to extract from the database (and that is the speed I get when I time GetFilePart).
Unfortunately, I have a very limited dev environment on this server: basically just g++. So I'll try to find out on my own what takes the time.

Rak'kar

Re: Autopatcher going very slow with multiples clients
« Reply #11 on: January 02, 2012, 02:05:58 PM »
This is with TCP right? In TCPInterface.cpp, can you try changing this:

Code:
// Sleep 0 on Linux monopolizes the CPU
RakSleep(30);

To actually use 0 and see if that helps?

You can also try just sending some data using TCPInterface from one computer to another and see if that takes a long time. The test DirectoryDeltaTransfer does this, so you can just try that test.

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #12 on: January 03, 2012, 03:06:39 AM »
RakSleep(0) does not change anything.
This morning we had network problems, so I tested it over a worse internet connection, and the problem got worse. To me, it comes from the thread logic: the thread accepting new connections and handling the second autopatcher client waits for the first thread to complete its data transfer.
About data transfer: with just one client, it is fine, so testing DirectoryDeltaTransfer from one computer to another is useless.
I'll try it with multiple clients (it looks like it can handle that), but I'm pretty sure the problem does not come from the database, as my test this morning shows it depends on my network connectivity.

SuperBidi

Re: Autopatcher going very slow with multiples clients
« Reply #13 on: January 03, 2012, 05:24:44 AM »
I 'solved' the problem by commenting out this line:
//#define USE_TCP

I'll start stress tests, even though I can't do that much, as I can't really stress it with only a few machines.
« Last Edit: January 03, 2012, 05:29:32 AM by SuperBidi »

Rak'kar

Re: Autopatcher going very slow with multiples clients
« Reply #14 on: January 03, 2012, 11:39:27 AM »
If it works when you do not use TCPInterface, the problem is possibly the thread priority you assigned when you called TCPInterface::Startup()