New home page

chashchin.net

Just a snapshot of my current train of thoughts. Written in Notepad++, old-school style :)

Posted in Game development | Comments Off

Proactor

http://en.wikipedia.org/wiki/Proactor_pattern

Here is the “single-threaded” implementation in C#, which basically is useless but shows the structure of the pattern.
Any activity we perform implements this interface

public interface IActivty
{
    void Perform();
}

When the activity is completed we are going to receive a callback

public delegate void CompletionDelegate(IActivty activity);

We have a single Processor that actually does the job. In reality it will do something smart like allocating a thread pool and queuing activities but here we just “do the job”

public class Processor
{
    private Dispatcher _dispatcher = new Dispatcher();
    public void Submit(IActivty activity, CompletionDelegate callback)
    {
        _dispatcher.Register(activity, callback);
        activity.Perform();
        _dispatcher.Complete(activity);
    }
}

The dispatcher is the other piece of the puzzle. It will do the callbacks when the activity is completed. It can be done asynchronously, maybe handle errors, do some other magic like multiple attempts to call back, but yet again let’s just make the call

public class Dispatcher
{
    private Dictionary<IActivty, CompletionDelegate> _callbacks = new Dictionary<IActivty, CompletionDelegate>();
    public void Register(IActivty activity, CompletionDelegate callback)
    {
        _callbacks.Add(activity, callback);
    }
    public void Complete(IActivty activity)
    {
        CompletionDelegate callback = _callbacks[activity];
        if (callback != null)
            callback(activity);
        _callbacks.Remove(activity);
    }
}

The interesting idea is to ensure that the callback is made on the same thread as the Submit to the Processor. That ensures that the whole construct looks single-threaded from the caller’s perspective.

Let’s see how it all works. The task is to read user’s input and output it back. We have 2 activities:

public class Read : IActivty
{
    public string Input { get; private set; }
    public void Perform()
    {
        this.Input = Console.ReadLine();
    }
}
public class Write : IActivty
{
    public string Output { private get; set; }
    public void Perform()
    {
        Console.WriteLine(this.Output);
    }
}

Then the main program will look like this

    static Processor processor = new Processor();
    static void Main(string[] args)
    {
        processor.Submit(new Read(), new CompletionDelegate(OnRead));
    }
    static void OnRead(IActivty activity)
    {
        processor.Submit(new Write() { Output = ((Read)activity).Input }, null);
    }

Honestly all these troubles are for the best when it comes to asynchronous calls. Besides that’s what is used in boost:asio
But more importantly it fits very nicely into the enterprise architecture.

Posted in Design Patterns | Comments Off

Generate GUID in VBA

Public Function GetGUID()
  GetGUID = Mid(CreateObject(“Scriptlet.TypeLib”).GUID, 2, 36)
End Function

Posted in Useful Code | Comments Off

Common prefix for the list of strings

string GetCommonPrefix(string[] list)
{
  string startString = list[0];
  int pos = 0;
  foreach (char c in startString)
  {
    for (int i = 1; i < list.Length; ++i)
      if (list[i].Length == pos || list[i][pos] != c)
        return pos == 0 ? "" : startString.Substring(0, pos);
    ++pos;
  }
  return startString;
}

Posted in Useful Code | Comments Off

Upload file to FTP (C#)

http://msdn.microsoft.com/en-us/library/ms229715.aspx

This example works for one file. If you loop it and try to upload lots of big files you will likely get 550 File Unavailable error. Here how to solve the problem:

FtpWebRequest ftp = (FtpWebRequest)WebRequest.Create(ftpUrl + fileName);
ftp.UseBinary = true;
ftp.KeepAlive = false;
ftp.Method = WebRequestMethods.Ftp.UploadFile;
ftp.Credentials = new NetworkCredential("login", "password");
using (Stream requestStream = ftp.GetRequestStream())
{
  using (FileStream fileStream = new FileStream(fileFolder + fileName, FileMode.Open))
  {
    byte[] buffer = new byte[4096];
    int count;
    while ((count = fileStream.Read(buffer, 0, buffer.Length)) > 0)
      requestStream.Write(buffer, 0, count);
  }
}
using (FtpWebResponse response = (FtpWebResponse)ftp.GetResponse())
  Console.WriteLine("Upload File Complete, status {0}", response.StatusDescription);

Posted in Useful Code | Comments Off

Custom memory management (implementation)

I reinvented the wheel… yet again. But the fun is not in the result but in the process. Here is how I designed the custom memory management solution:

  • Class MemoryManager creates a set of heaps. Each heap holds objects with sizes that are equal or less than the power of 2 to avoid memory fragmentation. The class allocates and frees memory by determining the size of the object and then delegating the management to the right heap. The smallest object size is equals to the size of a pointer, which I hard-coded as sizeof(unsigned). The total number of heaps is defined by a constant. It shouldn’t be very large so we don’t waste lots of memory due to de-fragmentation overhead. Big objects should also be managed, but in a different way, especially strings and other game resources. But I’ll think about them when I get to Resource Manager implementation. Each heap initial size can be fine-tuned for optimal performance.
#define NUMBER_OF_HEAPS 15
#define HEAP_SIZE 0x8000
static const unsigned OFFSET = Algorithms::BinaryLogarithm(sizeof(unsigned));
class MemoryManager
{
	Heap* _heaps[NUMBER_OF_HEAPS];
public:
	MemoryManager()
	{
		_heaps[0] = new Heap(sizeof(unsigned), HEAP_SIZE);
		for (int i = 1; i < NUMBER_OF_HEAPS; ++i)
			_heaps[i] = new Heap(2 << (i + OFFSET - 1), HEAP_SIZE);
	}
	~MemoryManager()
	{
		for (int i = 0; i < NUMBER_OF_HEAPS; ++i)
			delete _heaps[i];
	}
	void* Allocate(size_t size)
	{
		unsigned index = Algorithms::BinaryLogarithm(size) - OFFSET;
		if (index < NUMBER_OF_HEAPS)
			return _heaps[index]->Allocate();
		else
			return ::operator new(size);
	}
	void Free(void* address, size_t size)
	{
		unsigned index = Algorithms::BinaryLogarithm(size) - OFFSET;
		if (index < NUMBER_OF_HEAPS)
			_heaps[index]->Free((unsigned char*)address);
		else
			::operator delete(address);
	}
};
  • The heap initially obtains a handle to a native automatically growing Windows heap of a predefined size from which smaller chunks are allocated and linked into a list.
    MSDN states that if one calls ::HeapCreate(HEAP_NO_SERIALIZE, size, size) then the subsequent call to ::HeapAlloc(handle, 0, size) will fail as some space will be reserved for an internal structure. But what size is it? Apparently ,it’s just 1 byte (at least in Windows 7, 64 bit).
    The process of allocation is simple: we ask each chunk to allocate required memory and if they all can’t we create a new one. To free memory we find the chunk that owns it and request to free it.
class Heap
{
private:
	size_t _slotSize;
	size_t _heapSize;
	HANDLE _heapHandle;
	Chunk* _topChunk;
public:
	Heap(size_t slotSize, size_t heapSize) :
		_slotSize(slotSize),
		_heapSize(heapSize),
		_heapHandle(NULL),
		_topChunk(NULL){}
	  ~Heap()
	  {
		  while (_topChunk != NULL)
		  {
			  Chunk* next = _topChunk->_next;
			  if (_heapHandle != NULL)
				  ::HeapFree(_heapHandle, 0, _topChunk->_chunkAddress);
			  delete _topChunk;
			  _topChunk = next;
		  }

		  if (_heapHandle != NULL)
			  ::HeapDestroy(_heapHandle);
	  }
	  void* Allocate()
	  {
		  if (_heapHandle == NULL)
		  {
			  _heapHandle = ::HeapCreate(HEAP_NO_SERIALIZE, _heapSize + 1, 0);
			  _topChunk = new Chunk(_slotSize, _heapSize, _heapHandle);
		  }

		  void* address = NULL;
		  for (Chunk* chunk = _topChunk; address == NULL; chunk = chunk->_next)
		  {
			  address = chunk->Allocate(_slotSize);
			  if (address != NULL)
				  break;
			  if (chunk->_next == NULL)
				  chunk->_next = new Chunk(_slotSize, _heapSize, _heapHandle);
		  }

		  return address;
	  }
	  void Free(unsigned char* address)
	  {
		  for (Chunk* chunk = _topChunk; chunk != NULL; chunk = chunk->_next)
			  if (address >= chunk->_chunkAddress && address < chunk->_chunkAddress + _heapSize)
			  {
				  chunk->Free(address, _slotSize);
				  return;
			  }
	  }
};
  • Chunks are separated into slots. I tried lots of different ways to managed them. First, I held a separate bit array to see what slots were free/used. Then I tried to allocate the last byte in each slot to hold this information. But each time the performance was appalling. The truth is, if Allocate function does any kind of calculations, you are dead. It shouldn’t do anything but to return the next available slot. The solution was to organize slots into a linked list. But how do we store the list? Since we have all those slots why not use them? So each slot will contain either some object or a reference to the next available slot. Simples!
class Chunk
{
	friend class Heap;
private:
	Chunk* _next;
	unsigned char* _chunkAddress;
	unsigned* _freeSlot;
public:
	Chunk(size_t slotSize, size_t heapSize, HANDLE heapHandle) : _next(NULL)
	{
		_chunkAddress = reinterpret_cast(::HeapAlloc(heapHandle, 0, heapSize));
		_freeSlot = reinterpret_cast(_chunkAddress);

		unsigned* address;
		unsigned char* p = _chunkAddress;
		for (unsigned i = 0; i < heapSize / slotSize; ++i)
		{
			address = reinterpret_cast(p);
			p += slotSize;
			*address = reinterpret_cast(p);
		}
		*address = 0;
	}
	void* Allocate(size_t slotSize)
	{
		if (!_freeSlot)
			return 0;

		unsigned char* p = reinterpret_cast(_freeSlot);
		_freeSlot = reinterpret_cast(*_freeSlot);
		return p;
	}
	void Free(unsigned char* address, size_t slotSize)
	{
		unsigned* p = reinterpret_cast(address);
		*p = reinterpret_cast(_freeSlot);
		_freeSlot = reinterpret_cast(p);
	}
};

Lots of work and for what? Well, apart from solving memory fragmentation problem, the performance is not bad either

static MemoryManager GlobalMemoryManager;

class Complex
{
private:
	double _real;
	double _imaginary;
public:
	Complex (double real, double imaginary): _real(real), _imaginary(imaginary) {}

	void* operator new(size_t size)
	{
		return GlobalMemoryManager.Allocate(size);
	}
	void operator delete(void* address, size_t size)
	{
		GlobalMemoryManager.Free(address, size);
	}
};

int _tmain(int argc, _TCHAR* argv[])
{
	HighResolutionTimer<1000> timer;
	timer.Start();
	Complex* array[1000];
	for (int i = 0; i < 5000; i++)
	{
		for (int j = 0; j  < 1000; j++)
			array[j] = new Complex (100, 200);
		for (int j = 0; j <  1000; j++)
			delete array[j];
	}

	LONGLONG test = timer.Sample();
	std::cout << "Test: " << test << "ms" << std::endl;

	return 0;
}

The result of the test is 56ms. If I comment new and delete overrides in Complex class, thus relying on standard memory manager implementation, the result will be 590ms. So my solution is about 10 time faster. Hell, yeah!

Posted in Memory Management | Comments Off

Binary Logarithm

// Proudly stolen from http://graphics.stanford.edu/~seander/bithacks.html
inline unsigned BinaryLogarithm(unsigned x)
{
	static const unsigned int b[] = {0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000};
	static const unsigned int s[] = {1, 2, 4, 8, 16};

	register unsigned int r = 0;
	for (int i = 4; i >= 0; i--)
	{
		if (x & b[i])
		{
			x >>= s[i];
			r |= s[i];
		}
	} 
	return r;
}
Posted in Useful Code | Comments Off

Round to power of 2

Here is the fastest way to calculate it (no kidding):

static const unsigned bit[0x100] = 
{
	1, 2, 4, 4, 8, 8, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16,
	32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
	64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
	64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64,
	128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
	128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
	128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
	128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128, 128,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256,
	256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256, 256 
};
inline unsigned __stdcall RoundToPowerOf2(unsigned x)
{
	--x;
	int shift = 0;
	if( x & 0xFFFF0000)
	{
		x >>= 16;
		shift = 16;
	}
	if( x & 0xFF00)
	{
		x >>= 8;
		shift += 8;
	}
	return bit[x] << shift;
 }
 
Posted in Useful Code | Comments Off

High-resolution timer

template <unsigned int T> class HighResolutionTimer
{
private:
	HANDLE _thread;
	DWORD_PTR _mask;
	LARGE_INTEGER _frequency;
	LARGE_INTEGER _start;
public:
	HighResolutionTimer()
	{
		_thread = ::GetCurrentThread();
		_mask = ::SetThreadAffinityMask(_thread, 1);
		::QueryPerformanceFrequency(&_frequency);
	}
	~HighResolutionTimer()
	{
		::SetThreadAffinityMask(_thread, _mask);
	}
	void Start()
	{
		::QueryPerformanceCounter(&_start);
	}
	LONGLONG Sample()
	{
		LARGE_INTEGER end;
		::QueryPerformanceCounter(&end);
		return (end.QuadPart - _start.QuadPart) * T / _frequency.QuadPart;
	}
};

The template parameter should be set to 1000 for results in milliseconds or 1000000 – in microseconds.

And here is the test that shows why it is a bad idea to access array in the wrong order:

#define SIZE 100
int ar[SIZE][SIZE][SIZE];

int _tmain(int argc, _TCHAR* argv[])
{
	HighResolutionTimer<1000> timer;

	timer.Start();
	for (int i = 0; i < SIZE; ++i)
		for (int j = 0; j < SIZE; ++j)
			for (int k = 0; k < SIZE; ++k)
				ar[i][j][k] = 0;

	LONGLONG test1 = timer.Sample();

	timer.Start();
	for (int i = 0; i < SIZE; ++i)
		for (int j = 0; j < SIZE; ++j)
			for (int k = 0; k < SIZE; ++k)
				ar[k][j][i] = 0;

	LONGLONG test2 = timer.Sample();

	std::cout << "Test1: " << test1 << "ms" << std::endl;
	std::cout << "Test2: " << test2 << "ms" << std::endl;
	std::cout << "Diff: " << (test2/test1 - 1) * 100.0 << "%" << std::endl;
	return 0;
}
Posted in Useful Code | Comments Off

Display current time

TCHAR timeBuffer[9];
_tstrtime_s(timeBuffer, 9);
_tprintf(_T(“%s”), timeBuffer);

Posted in Useful Code | Comments Off