Wednesday, December 14, 2016

Tankless Water Heaters (...or how my life changed from van to homeowner)

Now we're building a house.  It's on a steep hill in Chattanooga.  More details on that later, but for now I'm contemplating tankless water heaters.  After hours of research, I settled on a single, central unit instead of three point-of-load units.  The main driving factor was cost, particularly since we'd need a large heater at all three locations; each location has a large water demand, but typically only one location needs that demand at a time... thus, one large central water heater can meet the large demand at any of the three locations (just not all at the same time).  With three separate heaters, we'd need to support that large demand at each site... hence, one central water heater is roughly three times cheaper.  We could have slightly undersized the units if installed at each site, but the price difference was negligible.

So we're installing one tankless water heater.  After much more research, I ruled out "ECOsmart" brand heaters because they have long-term reliability issues.  Long story short, it looks like Stiebel Eltron units last 10-20 years, and they've been making tankless heaters in Europe for more than 50 years.  They're more expensive than units you'd find in big-box stores, but they have several advantages:
-All internal piping is copper; cheaper units use plastic, which fails under the intense instantaneous heat load.
-No mechanical relays.  I hate relays.  Cheaper units use relays, whereas Stiebel's units use solid-state relays (SSRs).
-True temperature modulation provides a constant output temperature.

e-tankless.com sells factory-refurbished units at a great discount.  The only catch is that the warranty is only one year, but I'd much rather have a better-built unit with a shorter warranty than a shoddy design with a few more years of free replacements whenever a leak occurs.

So I settled on a refurbished Stiebel Eltron Tempra 29B unit, which consumes a whopping 29,000 watts under full load!  If wattage is a foreign concept to you, think about this: with that much power, you could charge Tesla's largest battery pack in about 3 hours... enough to drive 310 miles.  Put another way, if you pulled this much power 24/7, your power bill would be ~$2600/month.

That sounds expensive!  Well, it's not, because the tankless water heater is only on when you're actually using hot water.  Even when it's on, it's not running full tilt unless you're consuming hot water from all sinks/showers/etc. at the same time.  Say you shower once per day for 10 minutes, wash dishes twice a week, run the clothes washer twice per week, and wash your hands a dozen times a day (as you should).  Under those conditions, you're only requesting hot water for maybe 30 hours/month, and you're maybe pulling 15 kW on average over that time (roughly 450 kWh/month).  That works out to a much more reasonable ~$50/month, which is probably still on the high side because I don't do any of the above nearly as often as listed ;), and I've been ultra-liberal on the power/water usage.

Since we're all used to tanked heaters, here are some notable differences a casual user will notice when using a tankless system:

A)-There's a minimum flow rate requirement before the tankless heater will heat the water.  If you pull less than this much water, then the water will pass through the heater unheated (i.e. it'll be cold).  For example, if you are washing dishes with the sink running slowly, the water is going to be cold.  We'll discuss ways around this in a minute.

B)-If you constantly turn the water on and off, the water coming out of the sink will be hot/cold/hot/cold... this is because the tankless heater turns off every time the water flow stops... hence the water in the piping after the heater will remain warm, but the cold water flowing through the heater when you turn the hot water back on won't get heated until the electrical heating elements heat back up.  We'll discuss a solution to this in a minute.

C)-They aren't truly "instant."  Nothing in the real world is instant.  From what I've read online, the water takes another 5-15 seconds to heat up compared to traditional water heaters.  Since traditional heaters keep a tank of water hot 24/7, as soon as you turn the faucet on, that hot water starts flowing through the pipes (which have cold water sitting in them).  The time it takes to push that cold water out of the pipes won't change, so the 5-15 second figure quoted above is on top of that: it's the time the heating elements take to come up to temperature.

D)- You probably won't use the cold water taps in the shower/sink/etc... you can set these units to any temperature you want, so why waste that heated water by mixing cold water?  The only reason tank water heaters require mixing is they must superheat the water so that the tank water doesn't rapidly cool below your desired temperature when cold water backfills the hot water that left the tank.  This isn't a problem, but is a noticeable difference.

So what's the solution?  We can solve A, B & C by adding a return water line that recirculates water from the hot line back to the water heater's cold input.  Note: This isn't allowed on all heaters!  The Stiebel Eltron Tempra units accept input water up to 131°F, so we're good (most people shower between 105-115°F).

Since we want hot water to get to each hot water zone as fast as possible, we use a hub-and-spoke design to distribute the hot water (pink) in 3/4" lines.  The (green) return lines are intentionally undersized (1/2"), so that the recirculating pump doesn't affect water pressure at any point-of-load (PoL) that's actually being used.  Undersizing the return lines also helps evenly distribute the recirculated water across all three zones.  You could use three separate pumps (one for each zone), but since the pumps are expensive (~$200), you're much better off using a single pump.

We could leave the recirculating pump running 24/7, but then you'd lose tons of energy as the water cooled while traveling through the pipes.  This would easily add $50-100 to your energy bill each month, which is more than the hot water you actually use would cost.

Instead, we control the pump based on user demand.  There are a few control strategies:
1: Add a timer... for example, the timer runs from 8am to 11am, then 5pm through 11pm.  That is still really wasteful, and doesn't fit my erratic schedule; I sometimes shower at 3am.  Of course, you could still get hot water whenever, but issues A, B & C would still apply.

2: Add a button, or rather multiple buttons, one at each sink/shower/appliance.  Pushing any button starts the recirculating pump and keeps it running for 5 minutes (for example).  That solves A, B & C quite tidily, but requires the user to push the button... if they don't push the button, then A, B & C persist, but the user still (eventually) gets hot water.  It also prevents you from wasting all the cold water that would otherwise go down the drain while you waited for the water to heat up... simply push the button 30 seconds before you hop in the shower, start the dishwasher, clothes washer, etc.  It's important to reiterate that you don't need to push the button...

3: Add a water flow sensor and use it to enable the recirculating pump for 5 minutes each time some minimum flow rate occurs.  Note: The flow rate that enables the recirculating pump can be much lower than the flow rate that activates the water heater.  For example, if the water heater enables at 0.75 gpm, we can enable the pump at 0.25 gpm.  Since the pump would flow at 3 gpm (in my case), the 0.25 gpm flow will force 3 gpm through the heater, and thus it will stay on.  This solves A & B, but you'd still have to wait for the water to flow through the pipes, and for the heater to initially heat up.

4:  Implement options 2 & 3.  Option 3 (the flow sensor) automatically takes care of A & B, and pushing the button (option 2) takes care of C (instant hot water as soon as you turn on the spout).

5: Do nothing.

...

I'm going to implement option 4, as it only requires one additional component (flow rate sensor).  In fact, it might not require the flow sensor at all, as the Stiebel Eltron already has a built-in flow rate sensor... if I can connect an ADC to its output, then no additional parts are required.

This is a perfect application for an Arduino Uno.  And hell, if it breaks there's no harm to the system... check valves always work.
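Here's a rough sketch of what that Arduino code could look like.  To be clear, this is a starting point and not final firmware: the pin numbers, the flow-sensor pulse threshold, and the relay/SSR wiring are all placeholders until the plumbing actually exists.

    // Recirculation pump controller (option 4 sketch).
    // Assumed wiring: momentary button(s) to ground on pin 2 (internal pullup),
    // a pulse-type flow sensor on pin 3, and a relay/SSR driving the pump on pin 4.

    const unsigned long PUMP_RUN_MS = 5UL * 60UL * 1000UL;  // run the pump 5 minutes per trigger
    const unsigned long SAMPLE_MS   = 1000UL;               // evaluate flow once per second
    const unsigned int  MIN_PULSES  = 5;                    // pulses/sec treated as ~0.25 gpm (a guess)

    volatile unsigned int pulseCount = 0;
    unsigned long pumpOffAt  = 0;
    unsigned long lastSample = 0;

    void flowPulse() { pulseCount++; }                      // interrupt: count flow sensor pulses

    void setup() {
      pinMode(2, INPUT_PULLUP);                             // button (pressed = LOW)
      pinMode(3, INPUT_PULLUP);                             // flow sensor pulse output
      pinMode(4, OUTPUT);                                   // pump relay/SSR
      digitalWrite(4, LOW);
      attachInterrupt(digitalPinToInterrupt(3), flowPulse, FALLING);
    }

    void loop() {
      unsigned long now = millis();

      // Button press: start (or extend) a 5 minute pump run immediately.
      if (digitalRead(2) == LOW) {
        pumpOffAt = now + PUMP_RUN_MS;
      }

      // Once per second, see whether enough flow occurred to trigger the pump.
      if (now - lastSample >= SAMPLE_MS) {
        noInterrupts();
        unsigned int pulses = pulseCount;
        pulseCount = 0;
        interrupts();
        lastSample = now;
        if (pulses >= MIN_PULSES) {
          pumpOffAt = now + PUMP_RUN_MS;
        }
      }

      // Run the pump until the timeout expires (signed math handles millis() rollover).
      digitalWrite(4, ((long)(pumpOffAt - now) > 0) ? HIGH : LOW);
    }

If the Stiebel's internal flow sensor pans out, the pulse-counting interrupt just gets swapped for a threshold check on whatever signal that sensor provides.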

...

So overall, this system only requires two extra components during construction:
-A return line from each heated water zone (kitchen, guest, master).
-A differential data line to each zone.

The other components can be added to an existing system:
-QTY2 3/4" NPT check valves.
-QTY1 1/2" NPT check valve per zone.  In my case, QTY3 total vales.
-QTY1 pump (3 GPM).
-QTY1 splitter 3/4" NPT to 3x 1/2" NPT
-Buttons and the Arduino.

Sunday, February 28, 2016

Brick-It

The code from our last post let me easily verify my PCB layout.  I simply entered each pin name into the toggling code and probed to verify the signal went where I expected it to.  Sweet, no problems!

The next task was to use a faster master clock.  Many Atmel uCs have a built-in oscillator or two, but they're either really slow and/or terribly inaccurate.  Thus, many people opt to use an external oscillator.  In my application, I want to use an external clock (i.e. a 0-5 volt square wave generated elsewhere outside the chip, not by a crystal or resonator).  My uC is an Atmel Atmega64M1, which requires me to program 'fuses'* to use an external clock.
...

*This is a relic from eons ago, when small strips of metal were built into the CPU metal masks, and then programmers burned the desired 'fuses' with a high current pulse.  Obviously, these were one-time-write, but today the name 'fuse' persists even though they're reprogrammable (they're typically just a dedicated portion in the SLC Flash memory).   

...

So I go to program the fuses using a handy online calculator.  Afterwards, I'm no longer able to communicate with my uC, but I'm fairly certain it's running because the power consumption has increased by the expected amount (~5 mA), a hint that the faster clock is running throughout the chip now, whereas before it was blocked at the input.

After verifying the external clock met the required input specifications, and that the clock was connected to the correct pin, I found the issue: the clock frequency of debugWIRE - the serial interface I used to program the fuses - is a divided-down function of the CPU clock frequency.  Before, the CPU ran at 1 MHz (the internal default), so debugWIRE ran at 125 kHz or so; now the CPU was running at 8 MHz (the external clock frequency), so debugWIRE sped up proportionally.  Since debugWIRE communicates through the RESET pin (only), and since I had added a 10 nF capacitor to that pin, the serial data was getting corrupted.  So I drove up to my shop, took the capacitor off, and was back in business.

Next up I'm going to:
-figure out how to use the PSC PWM output module
-configure counter1 to output a 24 Hz clock, which is used by an exceptionally slow OEM data bus (a rough sketch of this is below).
-configure interrupts and validate interrupt handling time.
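For that 24 Hz clock item, here's roughly what I expect the setup to look like: Timer1 in CTC mode with a /64 prescaler off the 8 MHz clock, toggling an arbitrary placeholder pin (PB0 here) from the compare-match interrupt.  These are the standard megaAVR register names; I still need to verify the details against the 64M1 datasheet before trusting it.

    #include <avr/io.h>
    #include <avr/interrupt.h>

    ISR(TIMER1_COMPA_vect) {
        PORTB ^= (1 << PORTB0);                    // toggle at ~48 Hz --> ~24 Hz square wave
    }

    void slow_clock_init(void) {
        DDRB   |= (1 << DDB0);                     // placeholder output pin for the slow clock
        TCCR1A  = 0;
        TCCR1B  = (1 << WGM12)                     // CTC mode, TOP = OCR1A
                | (1 << CS11) | (1 << CS10);       // prescaler = clk/64
        OCR1A   = 2603;                            // 8 MHz / 64 / (2603+1) = ~48 compare matches/sec
        TIMSK1 |= (1 << OCIE1A);                   // enable the compare-match interrupt
        sei();                                     // global interrupt enable
    }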

Saturday, February 27, 2016

First Abstractions

In our last post, we discovered how to manually program hardware pins.  Our final code to continuously toggle a pin is:


    #include <asf.h>

    int main(void) {
        DDRB |= 1<<PORTB7;    //set output
        PORTB |= 1<<PORTB7;    //Set High
      
        while (1)
        {
            PORTB |= (1<<PORTB7);
            PORTB &= ~(1<<PORTB7);
        }
    }


...


Quick aside: Probing hardware pin PB7 shows the signal high for 2 us and low for 4 us.  My first guess was that '|=' is a single-cycle operator while '&=' takes two clock cycles, but that's not quite it: both should compile down to two-cycle SBI/CBI instructions, and the extra low time appears to be the two-cycle rjmp at the bottom of the while(1) loop, which executes while the pin is low.  I'll confirm against the disassembly later.


...


The first problem with the code above is we have no idea what "PORTB7" does in our system.  We could easily abstract the generic pin name to whatever it does in our system as follows:

    #define PhaseUL PORTB7


...which simply directs the preprocessor to substitute 'PhaseUL' with 'PORTB7'.  This helps us understand what pin we're toggling, but it doesn't make the code any simpler:


    #include <asf.h>

    #define PhaseUL PORTB7

    int main(void) {
        DDRB |= 1<<PhaseUL;    //set output
        PORTB |= 1<<PhaseUL;    //Set high
      
        while (1)
        {
            PORTB |= (1<<PhaseUL);
            PORTB &= ~(1<<PhaseUL);
        }
    }

Keep in mind all we're trying to do is set a pin high and low.  The code above still requires us to know which port 'PhaseUL' is on... we could #define it, too:



    #define PhaseUL_Port PORTB

Which would make our code look like this:


    #include <asf.h>

    #define PhaseUL PORTB7
    #define PhaseUL_Port PORTB

    int main(void) {
        DDRB |= 1<<PhaseUL;    //set output
        PhaseUL_Port |= 1<<PhaseUL;    //Set high
      
        while (1)
        {
            PhaseUL_Port |= (1<<PhaseUL);
            PhaseUL_Port &= ~(1<<PhaseUL);
        }
    }

In this same fashion, we could #define anything else we wanted to make it easier to understand what signal we're changing, but it certainly doesn't make the code any simpler.  Note that the above method would require two #defines per output pin!  Out of control!!!  Yuck.  There's got to be a better way.


...


Looking at the 64M1 gpio header files, it seems possible to use:

    gpio_set_pin_high(io_id)

However, Atmel never explains what 'io_id' values to use, and if you follow the rabbit hole, the function calls into a black box.  No luck there, so I turned to the intertubes.  I didn't find an actual answer, but the 8th reply to this discussion forum caught my eye: "gpio.h libraries are deprecated, and should be replaced with IOPORT.h".  Unsurprisingly, Atmel doesn't mention this in any documentation I've found, but hey, I'd previously wondered what 'IOPORT' was, so now I know.  My optimism tells me the glass is 1% full.


The IOPORT.h documentation has a quickstart guide and looks pretty straightforward.  After a few failed attempts, I arrived at the working code below:

    #include <asf.h>
    #define PhaseUL IOPORT_CREATE_PIN(PORTB,PORTB7)

    int main(void) {
        ioport_init();

        ioport_set_pin_dir(PhaseUL , IOPORT_DIR_OUTPUT);
        ioport_set_pin_low(PhaseUL);   

        while (1)
        {
            ioport_set_pin_high(PhaseUL);
            ioport_set_pin_low(PhaseUL);
        }
    }


This code is equivalent to the code further above*, but now we've used the ioport functions to abstract the actual pinout.  So why is this useful?  Now you don't need to know anything about the pin you want to toggle except its human-readable name.  Previously we worked around this with two #defines per pin (pin name and pin port), and we'd really need a third to abstract the direction register.  Now we just use the descriptive ioport functions with our human-readable pin names.  That's huge!

So what do we do now?  We need to define every pin we're using in the 'board_conf.h' file, so that it's not cluttering up our main loop.  Then we can just use the various ioport functions to manipulate our gpio pins. Finally, an ASF codebase that I've figured out how to use!
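To make that concrete, here's a rough sketch of the pattern.  PhaseUL is the real signal from above; the other names and pins are made up purely for illustration:

    /* board_conf.h -- one #define per signal, using human-readable names */
    #define PhaseUL    IOPORT_CREATE_PIN(PORTB, PORTB7)
    #define PhaseVL    IOPORT_CREATE_PIN(PORTC, PORTC1)   /* hypothetical output */
    #define FaultIn    IOPORT_CREATE_PIN(PORTD, PORTD3)   /* hypothetical input  */

    /* main.c -- no ports or bit masks in sight */
    #include <asf.h>

    int main(void) {
        ioport_init();

        ioport_set_pin_dir(PhaseUL, IOPORT_DIR_OUTPUT);
        ioport_set_pin_dir(PhaseVL, IOPORT_DIR_OUTPUT);
        ioport_set_pin_dir(FaultIn, IOPORT_DIR_INPUT);

        while (1) {
            /* mirror the input onto PhaseVL, just to show a read/write pair */
            if (ioport_get_pin_level(FaultIn)) {
                ioport_set_pin_high(PhaseVL);
            } else {
                ioport_set_pin_low(PhaseVL);
            }
        }
    }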

...

* (from above).  The code is equivalent, except that the pin is now high for 4 us (was 2 us), and low for 2 us (was 4 us).  In other words, the signal is inverted.  I suspect the ioport functions are writing to a different hardware register than PORTxn, but I'll look into it later.

Thursday, February 25, 2016

Atmel 8 bit GPIO 101 (a.k.a. Atmel for noobs)

I've spent the past month proactively vacuuming the internet for any and all useful Atmel programming guides.  Unsatisfied with what I found, I bought the textbook Embedded C Programming and the Atmel AVR and read it cover to cover in a day.

Throughout my edumacation, I've tried several times to actually create useful programs, but it's a long road to master programming, so I haven't progressed nearly as fast as I usually do with new engineering challenges.

My major complaint with programming is the information overload... each programming language has its own problems and 'solutions'.  Even within a programming language, there are numerous 'dialects', each of which takes time to wrap my head around.  Each programmer has his or her own programming style, resulting in numerous different architecture types.  In short, there's a ton to learn about how to implement each simple concept.

I'm increasingly frustrated that although I have the entire system created in state diagrams, notes, pseudocode, etc, I constantly find myself banging my head on the wall/desk/floor trying to create actual code. 

In comparison, electrical engineering is all about learning relatively few concepts, and then applying them in millions of ways.  There's no 'language' in hardware design... a MOSFET in Mexico is the same as in France... it always behaves according to the same fundamental principles, and with enough caffeine I can tell you exactly what a circuit does without scouring the internet for hours, trying to figure out what the engineer was trying to design.

software:offense :: hardware:defense 
90% of offense is memorizing numerous different plays and executing them perfectly, whereas 90% of defense is hitting the guy with the ball.  I preferred defense when I played.  Aside: I still don't regret quitting football when I did.

...

Down to business: Controlling GPIO 101 on Atmel Microcontrollers

At the lowest level, all a microcontroller does conceptually is turn pins on and off, and - usually not at the same time on a specific pin - measure whether a pin is high or low.  Today, let's talk about how to program a microcontroller to read and write pins, otherwise known as 'GPIO' (general purpose input/output).

I'm assuming you've used the internet to help you figure out how to install your preferred programming environment.  I don't have enough experience to 'prefer' anything, but I'm using Atmel Studio 7.0.  So far I hate it for many reasons, but I'm using it for now, until I understand it well enough to know which other IDE to use.

So you've got the software setup and have verified you can talk to the hardware using the built-in debugger tool.  Now what?  Now it's time to bit bang individual pins!

The first major concept to understand is that each pin on today's microcontrollers can do tons of different things.  Nearly every pin can be used as a low-level GPIO pin, but can also be used for dedicated hardware timers, PWM outputs, serial buses, etc.  The actual additional functionality depends on your specific microcontroller.  Today we'll focus solely on GPIO.

When you're using a pin as GPIO, the code itself must manually read/write the pin each time you want to read/write data.  Compare this to using a pin as a simple counter, where the hardware measures the number of pulses received without requiring the CPU to increment a counter in software each time a pulse occurs.  Counters let the CPU do other things, whereas GPIO requires the CPU to continuously service a pin each time it wants to read/write a value.

Each time a uC is powered on, the default state of all pins is high impedance GPIO.  Once the code starts running, the pins can be configured however the programmer chooses.  Any pin that isn't specifically changed to something else will remain a high impedance GPIO.  I propose it's best to specifically program all pins, even if you're using their default behavior.

So how do we interface the hardware to the software?  If you look at your data sheet (I'm using the Atmega64M1 throughout this post), you'll find an exhausting - yet incomplete - description of the particulars.  If your head isn't spinning after the first 50 pages, check back once you get to page 1315 or so (the UC3C's hardware manual is actually this long).

So let's just pick a pin and work through programming it to do simple things:
physical hardware pin16, you're our lucky winner!
Looking at page 3, the text associated with pin16 is:
"(ADC5/INT1/ACMPN0,PCINT2) PB2"
Wow, that's a mouthful!

Each item inside the parenthesis is an alternate function besides the standard GPIO capabilities.  Thus, pin16 simplifies to:
"PB2"

...but what does "PB2" mean?
Microcontrollers don't individually address each pin.  Instead, multiple pins are grouped into a "port".  Typically, the port width matches the architecture width; for example, there are typically 8 hardware pins in each port on an 8 bit uC.  Thus:
'P' is short for "port"
'B' signifies pin16 is part of port 'B'.
'2' means pin16 is the third* bit in port 'B'.
*ports are nearly always zero-indexed (i.e. the first element has index zero, the second element has index one, the third element has index two, etc).

So how do we pull pin16 low?  The manual explains GPIO in great detail (from page 51 to page 58).  To summarize, first we need to configure pin16 as an output:
 DDRB |= ( 1 << DDB2 );
Holy crap, did I lose you?  I was lost too, as there are a TON of concepts in that single line.  Let's break it down into the following sections.  But first, some background:

'x' refers to a specific port (pin16 is in port 'B', as discussed above).

'n' refers to a specific pin location inside a port (pin16 is at index '2' in port 'B', as discussed above).

'DDRx' is the "data direction register", which tells the uC whether each pin in the port is an input or an output.  DDRx contains 8 bits, one for each pin in port x.
'DDxn' is simply a hardware abstraction of a pin's location in the port.  Specifically, DDxn is simply a number.  For example, when the code sees 'DDB2', it searches to where 'DDB2' was defined, and ultimately replaces 'DDB2' with the number '2'.  It's okay to be a bit confused still.

'<<' is the left shift operator.  At a high level, the binary representation of the number on the left is shifted left by the number on the right.  Since 'DDB2' is on the right, and we know its value is '2', the uC is going to shift '0b00000001' left two places.  Thus, "( 1 << DDB2 )" simply becomes "0b00000100".  FYI: "0b" in front of a number simply means the number is represented in binary.
So what is the '<<' doing?  It's creating a "bit mask," which is then used with some logical operator ('|=' in our example) to do something else.  It's okay to be more confused... this taco's gonna roll back around on itself until we get all the fillings inside.

Okay, so we've evaluated the right side ('0b00000100'), which is simply 4 in our standard decimal system.  Now what's this '|=' symbol all about?  Programmers like shortcuts, as they make writing code faster, but unfortunately they confuse the hell out of new programmers (like myself).
'A |= B' is shorthand for:
"OR each bit in 'A' with each bit in 'B' and then store the value in 'A'". Mathematically, it's equivalent to:
A = A | B
Which is completely different from, and not to be confused with:
A = A || B

'|' means OR each individual bit on the left with each individual bit on the right.  If 'A' and 'B' have the following values:
A: "0b00010001"
B: "0b01010000"

Then after executing A=A|B, A will have the value:
A: "0b01010001"

On the other hand, if we use '||', we're telling the uC to OR the entire left side with the entire right side.  In computer programming, any value besides zero is considered 'TRUE'.  Thus, the result of:
A || B
is simply '1', because '||' is simply a logical OR, and at least one side of said OR is TRUE (in fact, both sides are).  Thus, the result of "A||B" is simply 'TRUE'.  So now the calculation becomes:
A = TRUE
which doesn't make a whole lot of sense, since 'A' is an 8 bit register.
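If the difference between '|' and '||' still feels abstract, here's a tiny sanity check you can compile on a regular PC (desktop C, not uC code; the printf calls are just to display the results):

    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t A = 0x11;             // 0b00010001
        uint8_t B = 0x50;             // 0b01010000

        uint8_t bitwise = A | B;      // 0x51 = 0b01010001: each bit ORed individually
        int     logical = A || B;     // 1: both sides are nonzero, so the logical OR is 'TRUE'

        printf("bitwise: 0x%02X\n", (unsigned)bitwise);
        printf("logical: %d\n", logical);
        return 0;
    }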

...

So let's put it all together:
 DDRB |= ( 1 << DDB2 );
 -finds the index of pin16 (DDB2), which is '2'.
-shifts the number 1 (i.e. 0b00000001) '2' places to the left, resulting in the mask '0b00000100'
-bitwise ORs the port DDRB with the mask '0b00000100' (note that the other 7 pin values remain unchanged, hence why we created the mask in the first place).
-stores the result in DDRB

At the end of the day, that one line of code changes pin16 to an output.  Nothing more, nothing less.  See if you can fill up that taco a bit more before continuing.

...

Yum, tacos!

...

So now we've completed the conceptually simple task of changing pin16 to an output.  Now we need to set the output value.  First, let's make pin16 output a logic 1 (a.k.a 'HIGH', a.k.a. 5 volts on my board):
PORTB |= (1 << PORTB2);
Far fewer new concepts this time.  In fact, we're doing the exact same thing as the previous example, except that we're reading from and writing to a different hardware register (PORTB instead of DDRB).  New concepts:
'PORTxn' is similar to 'DDxn' in the previous example.  On the 64M1, the two values are interchangeable, although this is bad practice, as larger uCs might not assign the same index to multiple hardware registers linked to the same hardware pin.  As before, 'PORTB2' is simply defined elsewhere in the hardware abstraction layer as '2'.  Humorously, Atmel confuses the two in their example program.  Great way to start someone out ;).

'PORTx' is an 8 bit register whose individual bits each represent one hardware pin.  When a given pin is configured as an output (as we did in the previous example), writing the corresponding bit HIGH causes the actual output voltage on that pin to go HIGH (i.e. 5 V on my board).  Likewise, writing the corresponding bit LOW causes the actual output voltage on that pin to go LOW.  On the other hand, if the pin is configured as an input, writing a '1' to the corresponding bit enables an internal pullup resistor*, whereas a '0' disables said pullup, causing the pin voltage to float (i.e. tristate).

*The 64M1 has a single bit "PUD" in the MCUCR register that must also be set low to allow any pin to enable its internal pullup.  I'm glossing over this for now.
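Since we're talking about inputs, here's the same idea in code form (just a sketch; note that reading a pin comes back through yet another register, PINx, which I'm otherwise glossing over in this post):

    DDRB  &= ~( 1 << DDB2 );      // pin16 back to being an input
    PORTB |=  ( 1 << PORTB2 );    // with the pin as an input, a '1' here enables the pullup

    if ( PINB & ( 1 << PINB2 ) ) {
        // pin16 reads HIGH (nothing external is pulling it down)
    } else {
        // pin16 reads LOW (something external is pulling it to ground)
    }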

So now we've configured pin16 as an output and made the output HIGH.  It should be simple to make pin16's output LOW, right?  Let's see:
PORTB &= ~(1 << PORTB2);

Hmm... the names are the same, but now we need to understand a few more Boolean algebra concepts:
(1<<PORTB2) is the same as before: '0b00000100'
'~' bitwise negates the above into '0b11111011'

'&=' is similar to '|=', except that now we're bitwise ANDing PORTB with '0b11111011'
Thus, '0b11111011' is simply a mask that only changes the value of pin16.  Specifically:
-The goal was to make pin16 output low
-pin16 is at index 2 (i.e. the third bit from the right, because the first bit is zero indexed).
-index 2 is '0' in the mask (0b11111011)
-when we AND index 2 ('0') with the previous output value ('PORTB'), the result is always zero.
-when we AND each other index ('1') with the previous output value ('PORTB'), the result is always the previous value.
Thus, we've successfully changed pin16's output to LOW, without changing any other pin on the port.

...

Phew!

Now there's one last concept before we call it a day.  What if you want to change the state of two pins in the same port to LOW in the same operation?  Let's say we want to change both pin16 and pin23 to LOW.

Previously in our code, we must have defined these pins as outputs (remember: the default value is always input until reassigned).  We already know how to do this with pin16:
DDRB |= ( 1 << DDB2 );
Then we go back to the data sheet to find out pin23 is "PB3".  Thus, we could have individually assigned pin23 as an output in a similar fashion:
DDRB |= ( 1 << DDB3 );

Perfectly valid to do them one at a time, but it takes fewer cycles to lump them all together and write the port register once:
DDRB |= ( 1 << DDB2 | 1 << DDB3 );
Thus, our mask becomes '0b00001100'.  Then we bitwise OR it with the existing values and BOTH pin16 & pin23 become outputs in the same instruction.  Boom, it's that simple!

To complete our example, we just need to make both pin outputs LOW:
PORTB &= ~( 1 << PORTB2 | 1 << PORTB3 );

As before, we create a mask (i.e. everything inside the parenthesis): '0b00001100'
Then we negate the mask with '~': '0b11110011'
Then we bitwise AND the mask with the previous PORTB values, and presto(ish), we've set both pin16 and pin23 LOW, and it only took THIS ENTIRE POST to learn how to do it.

Isn't programming fun?

...

So obviously these are very low level tasks, and people can abstract them into higher level functions.  So why not use those?  It turns out Atmel's software abstraction layer is REALLY horrible, particularly because the software documentation is - shall we say - sparse.  Worse still, Atmel's Software Framework (ASF) attempts to unify every single product they sell into a single programming paradigm, which means abstractions are placed on top of abstractions, which are placed on top of abstractions, etc.  By the time we get to these so-called "easy to use" functions, they're so abstracted that it's really hard to understand what's going on at the hardware level.  I waded through hundreds of files, found a few bugs, wrote some horrible code, and then finally gave up, threw a virtual match in Atmel's ASF, and started from scratch.

I did keep their lowest-level abstraction layer, which tells the compiler which specific address belongs to "PORTB".  I then created a single-layer hardware abstraction that contains the concepts discussed in this post (and many more).  And while it took a little more time to figure it out, now I understand it, and the code base is simpler.  I've given up 'portability' - which is the concept that I could move an ASF project to a different processor with relative ease - but I don't plan on doing that.

That's all for now. 

Wednesday, February 24, 2016

Nearly another year

Hello to anyone still using an RSS reader (I suspect y'all are my only audience, due to my infrequent updates).

I've started a new project to replace the battery management system in the Honda Insight.  You, fine reader, are picking up the project in medias res; most of the work I've done up to now is available  here (nerd alert).  Project 'Linsight' is inspired by my intense fascination with the Insight, battery management, and electrons in general. 

So what am I doing? 
In short, I'm replacing the OEM ('original equipment manufacturer') computers that control everything about the hybrid system.  Specifically, I'm replacing two computers and the entire hybrid battery.

Why am I doing this?
The Insight was the first hybrid car sold in the USA.  In short, the technology is severely outdated, and IMO wasn't good to begin with.  Linsight's two main goals are:
-to provide an open source hardware platform to allow complete control of the hybrid electronics. 
-to allow lithium batteries to power the hybrid system.

What is Linsight's status?
I designed the hardware from December 2015 to January 2016, and received the revA PCBs a few weeks back.  Designing the hardware required me to completely reverse engineer over 100 signals, using spotty service manuals and folklore internet data.  I ultimately probed every single signal with an oscilloscope.  The OEM wiring harness is cut to pieces, but I don't care because I bought three more (and have kept one in pristine condition for the final install).

How's the revA PCB?
Mostly good.  There are a few minor hardware changes for revB.  The revA PCB won't work in an actual insight due to one particularly deceptive marketing issue in one of the microprocessor data sheets, but I can fully test out the entire system regardless.

What work is left?
A lot!  As of today, Linsight has nearly zero usable code running on its uCs, so it doesn't do anything except powerup and toggle its pins on command.  I thoroughly understand the OEM system behavior, but I've had to ramp up my knowledge a ton over the past month.  Except for 'baby' Arduino projects, Linsight is the first major microcontroller (a.k.a 'uC') project I've done in 15 years.  Saying "I'm rusty" doesn't even begin to capture how I feel; microcontrollers have advanced just as quickly as personal computers, so pretty much everything I knew back in 2000 is obsolete.  True, the C programming language hasn't changed, but the hardware and software implementations are considerably more complex (for better or for worse).

The Linsight PCB has two onboard microprocessors:
-Atmel AT32UC3C1512C (a.k.a. 'UC3C')
-Atmel Atmega64M1

The UC3C is a behemoth that runs the entire show, but has some of the worst programming documentation I've ever seen in a released product.  I wish I'd known this before I selected the UC3C, as the hardware design and the hardware documentation are actually stellar.  I could write for hours on why I'll never use another Atmel 32 bit microcontroller, but that's what's on the board today, so I hope I learn to love it.

The 64M1 is substantially simpler, as it's only used to actually spin the hybrid-electric motor.  Linsight uses a dedicated motor microcontroller because the control algorithm is time-critical.  Without a dedicated motor uC, the entire system would have to meet tight timing requirements to ensure the motor phases were driven at precisely the correct time; with two uCs, the timing is relaxed throughout the rest of the system.

...

I've decided to resurrect electrosanity.blogspot.com in hopes that documenting my struggles proves useful to any future readers.


Tuesday, March 31, 2015

Wow, it's been well over a year since my last post!

So much has happened, but as usual I've run out of time to talk about it.  I'm adding the following documentation in hopes that google scrapes this site, so other battery hackers can learn from my tinkering with a Chevy Volt battery.

I couldn't find any information on manually charging the 2013-2014 Chevy Volt's LiNiMnCoO2 battery, so after reading a bunch of research papers and verifying the data with some empirical measurements, I present the following charge/discharge advice:

The 2013-2014 Volt's battery is made up of three parallel LG Chem 15 Ah cells per segment.  While the cells can handle up to 4.5 V peak, the lifetime is considerably shortened (less than 100 cycles) under these conditions, and voltages much higher can cause thermal events (i.e. a fire).  At 4.4 V, the lifetime increases to 125 cycles, which isn't much better.  The main reason for the shortened lifetime is carbon dioxide evolution and Mn dissolution into the electrolyte.  I've defined lifetime as the point where less than 60% of the original capacity remains... which is a lot of lost capacity (I'm being generous).  In short, don't overcharge lithium cells!

And thus we must charge to lower voltages and give up the additional storage capacity.  At 4.3 V, we're up to 300 cycles.  At 4.24 V we're up to 500 cycles or so.  At 4.1 V we're somewhere around 1000 cycles.  Cycle life continues to drastically increase for a couple hundred more mV, but then it levels off.

In short, I've chosen to charge each cell to 4.08 V to maximize lifetime.  Once 4.08 V is hit at 1C, there's an additional 2.5 or so Ah (~5% of nominal capacity) available if you hold the voltage constant and let the current taper.  I'm building a BMS that'll just cut off the charge at 4.08 V, as using a smaller section of the overall SOC also helps increase battery lifetime.

...

Cycle life is also greatly reduced by over-discharging cells.  There's not much energy left below 3.4 V, so it's certainly not worth the decreased lifetime to drag them all the way down to 3.0 V.  I found less than 2 Ah between 3.0 and 3.4 V (less than 5% of nominal capacity).  Considering the decrease in lifetime is exponential once you drop below 3.2 V, it's just not worth sacrificing lifetime to milk every last Ah.

...

tl;dr: For maximum lifetime @ 1C:
-CC charge the Chevy Volt's LiNiMnCoO2 cells until they hit 4.08 V,
-then hold 4.08 V (CV) until the current tapers to C/50 (~1 A) if you want an additional 5% capacity.
-Discharge the cells no lower than 3.3 V, or 3.2 V if you want that last 5%.

Friday, August 2, 2013

Random thoughts on buying/charging batteries.  

Both sealed and wet lead acid batteries are self-balancing as long as you drop the current down to below ~C/5 once you hit a predefined voltage (for example, 14V on a 12V car battery).  Lead acid self-balances by converting the excess charge passing through the cell into heat - without the voltage rising further - which is why you simply need to lower the current... otherwise, the heat will damage the cell.  The voltage will not rise above the final charge voltage unless the current is too high.

Commercial batteries are cheaper than individual cells due to volume.  If you buy less than several thousand individual cells, there's a huge transactional cost; batteries are a commodity item.  Buy someone else's battery pack and use the cells inside.  DeWalt used to be the de-facto source for A123 cells, but they've migrated over to Panasonic cells, which are also good, but can't carry as much current. 

Lithium cells don't regulate voltage by converting excess charge into heat.  Lithium's energy storage mechanism  doesn't roll off the voltage before the cell is damaged.  Thus, as the cell reaches full, the voltage will begin to rapidly rise (within 30 seconds or so).  Exceeding some nominal voltage (3.65V for LiFePO4) rapidly degrades the cell.  For LiPo, after a minute or so, the cell can ignite... the voltage rises rapidly enough on a charged cell that the constant current ends up dumping considerable power into the charged cell, which heats it real quick. 

There are several off-the-shelf charging solutions for cell packs up to 6S/8S.  Check out the RC helicopter scene for recommendations.  Most of these chargers are very inefficient, as they simply place a transistor across each cell's leads and partially enable it through a resistor as the cell gets full.  Thus, as soon as the first cell gets full, you've got to drop the charge current to whatever your transistor can handle... since you're just moving the heat source from inside the battery to outside the battery... at the end of the day, you've got current moving across a voltage (the cell's), which equals power (heat) that you've got to get rid of.

There are more elegant methods of using this excess charge rather than turning it into heat: one I'm working on a TBA design for, and another that requires isolated pumping of charge from one cell to another via a transformer or capacitor... see the LTC3300 for a nice new design from Linear Tech.

Sunday, March 3, 2013

Battery Tab Welds Failing

It's been 3 years since I got the ebike running.  Everything is holding up except the batteries, which have slowly declined in performance, such that I can now ride ~30 miles between charges... it hasn't gone unnoticed, but I've been too lazy to do anything about it.

The pack consists of 192 A123 2.3Ah cells, configured as 16s12p (nominally 52.8V/27.6Ah/1457Wh).    Lately, the pack is empty after ~530Wh, which is 3x less capacity than nominal!  I'd noticed that a few particular banks were consistently underperforming, so today I rode around until the first cell hit 2.5V (0% full)(10.4Ah), then I used a 300W programmable load (BK8500) to pull the other 15 taps down to 2.5V, recording the additional ampacity of each individual bank.  The results were sub-optimal, with the lowest bank at 10.4Ah (37% of nominal), and the highest bank at 22.9Ah (83% of nominal; acceptable).  Overall, 11 of 16 banks exceeded 20Ah (72% of nominal).  The remaining 5 banks were 19.9,19.4,16.4,12.9,10.4Ah... pretty bad.

At this point, I thought "well shit, that's what you get for not running a battery management system on the cells... over/undercharged all the time and this is the price you pay."  But then I noticed a broken tab on the 10.4Ah cell... then another broken tab, etc.  All said, there were 11 broken tabs (as noted):

A1: 22.4Ah
A2: 22.4Ah
A3: 19.4Ah - 2x broken tabs
A4: 21.4Ah
A5: 19.9Ah
A6: 20.4Ah
A7: 12.9Ah - 4x broken tabs
A8: 21.0Ah
B1: 10.4Ah - 5x broken tabs
B2: 16.4Ah - 0x broken tabs :(
B3: 20.1Ah
B4: 21.4Ah
B5: 21.9Ah
B6: 22.4Ah
B7: 22.9Ah
B8: 21.4Ah

What does this mean?  A broken tab means that particular cell isn't connected.  For example, B1 had five broken tabs, which means that only 7 of the 12 cells were actually connected.  Thus, one would expect the measured ampacity to decrease to 58% of nominal (16Ah).  Since our measured ampacity was 10.4Ah, B1 hits 65% of its 7-cell expectation instead of 37% of the 12-cell nominal.  65% is still lower than the other 'healthy' banks, but it hints that fixing the broken tabs could yield nearly twice the distance travelled on a single charge.  Doing some quick math assuming all cells were connected, the expected ampacity becomes:


A1: 22.4Ah
A2: 22.4Ah
A3: 23.0Ah - expected
A4: 21.4Ah
A5: 19.9Ah
A6: 20.4Ah
A7: 19.4Ah - expected
A8: 21.0Ah
B1: 17.8Ah - expected
B2: 16.4Ah
B3: 20.1Ah
B4: 21.4Ah
B5: 21.9Ah
B6: 22.4Ah
B7: 22.9Ah
B8: 21.4Ah

What does this mean?  Simply rewelding the broken tabs (aside: weld != solder; you must use a high current battery tab welder!) would increase total usable energy from ~530Wh to ~840Wh... a 58% increase!  So I'd be crazy not to just do this, right?

Wrong... For the past 3 years I've internally loathed the fact that the existing battery pack construction didn't allow for any method to implement a high current, active battery management system.  In order to add a high current BMS, I'd need to be able to switch each bank in and out, such that once a cell hits empty or full, it is removed from the stack.  Since both the charge and discharge use cases are current controlled, the actual stack voltage isn't relevant; decreasing voltage limits maximum power, but otherwise doesn't matter.

Instead of fixing this flawed design, I've decided to take this opportunity to iterate the design.  I am literally going to tear the battery pack down to individual cells and start over.  It's a tough decision, but I've had three years to mull it over.  One major contributing factor is that I don't actually own a battery tab welder, which means I can't fix the pack as-is.  Soldering to cells is difficult and damages the chemical structure, since the entire cell acts as a heat sink (and must heat up enough to flow solder).

In a previous post, I mulled over and then ultimately decided against using/building a battery tab welder for three reasons: 
1). Turn-key battery tab welders are expensive
2). DIY battery tab welders are a science, and can melt through tabs, etc.
3). The original A123 DeWalt drill packs were already tab welded in series together... I simply soldered 12 packs together across the existing leads to ensure there were only 16 voltages (instead of 192 in the case that the 12 parallel packs were each only connected at the stack plus and minus).  

(1) is still true, but now that (3) is no longer true (tabs have separated), (2) seems like the logical solution.  Plus - and this is a big one - now I get a chance to start all over again and implement a BMS, without having to purchase 192 new cells (at $13/each = ~$2500).  An even better benefit is that once I get all the cells apart, I can test the cells in the weaker banks individually, and replace just those that need replacing, thus yielding 20+Ah on all banks, which would equate to 1000+Wh (a 93% improvement).

Of course, this requires I either purchase/borrow a turn-key battery tab welder (1), or build my own (2).  Conceptually, a battery tab welder is simply a power supply, a large capacitor, a silicon controlled rectifier (a switch you can turn on, but not off) and two solid cables with copper tabs.  The magic is in how much energy - and at what voltage - you store in the caps.  I'm going to look around and if I can't find something cheap, I'll begin experimenting with the tab welder.

The new design should ideally fit in the existing mechanical enclosure.  I think this is possible.  Of course, another option is to remove half the cells, which would continue to give me the range I'm getting now, but I'd rather have a much larger range and completely idiot-proof batteries.  More thoughts to follow.

Tuesday, November 13, 2012

Meltdown

I just spent 11 hours lowering a circuit's power by 400mW.  Our total budget is 76500mW, and we're getting innovative trying to get there.  How much is 400mW?  
-If you're using a laptop, it's less energy than your computer consumed while you read 'ab' (c,d,etc).  
-In 24 hours, that saves enough energy to run your AC for 8 seconds... brrrrr!
-In full sun, it's the conversion rate of a 1x1" solar panel.
-Enough to keep our product from overheating (?).

One problem with standards is they tend to box you in.  In our case, we started with a 100000mW budget, which is equivalent to the power consumed by one of those old incandescent light bulbs your kids will never know existed.  Instead of lighting your room, our product allows you to digitize data at 56,000,000,000 samples per second.  For comparison, the music you're listening to tricks your ears into perceiving continuous sound at a glacial 44,100 samples per second... we're over a million times faster... and with our combined power savings, we're doing that at only 76000mW... each digitized sample requires only ~1.4nJ (76 W divided among 56 billion samples per second - roughly the energy of a strand of hair falling a fraction of a millimeter onto your pillow as you sleep).

Sunday, June 3, 2012

Slideshow

Two solid anchors:

Outside our thermal chamber room:

diagAbby road:

Viborg, SD.  I prefer a shotgun, but whatever floats your boat.

An homage:

36A will flow through my circuit, but I forgot to size the wires leading to it... result: smoke:

Said circuit, routing 36A with measured 21.6mV voltage drop... that's 600uOhm if you're keeping score.

Components hacked onto our $2,000,000 (thus far) new design... my job for the past 9 months:

A buddy's parents bought some new land on a cliff.  We bushwhacked the first of (hopefully) many routes: Provisional Endeavor, 5.10c:

Lizard at the base of one of our rigged anchors:

 Howie rocking my back (front):

Taking a measurement that requires you to leave the cube (otherwise there's too much static, movement etc, all of which affect the measurement):

A giant stick bug checking out a Pelican case:

 Night climbing at Gus:

A throwback to AE, supporting a customer, literally banging my head against the floor:

Finishing the annual 100 mile Shiner Ride in 5+ hours on a single speed.  My compliments to Kreutz; next time I'll be sure to "visibly display [my] race entry tag" so you can properly categorize who I am.

The knob of an X-07 Group 1-R electronic lock... 
...the only lock design that has never been successfully picked.  The knob turns a generator that causes a computer in the locking mechanism to spit out random numbers on the top.  You stop turning the knob when the number hits your code and then spin the other way... there's no way to 'feel' inside the lock, hence it's impossible to determine the correct combination, which can be up to 6 codes long (i.e. one trillion possibilities).  Of course, the likely entry method is now a skill saw with a cutting wheel, but for cheap on eBay, it's definitely neat.

An M16A1:

Hidden behind the safety-selector switch is the magical word "AUTO".  One trust, 20+ hours of research, and many, many hard-earned dollars later, I anxiously await an ATF envelope containing a cancelled $200 tax stamp, like the one below:
I'm actually waiting for two (2) tax stamps: one for the M16, another for the silencer I'm going to put on the end of it... because there's nothing more Texas than a silenced, fully automatic, short barreled rifle.  Absolutely illegal in California in every possible manner.  
Aside: you can get around many of CA's anti-assault-weapon laws with the $15 Bullet Button.

240 in Motion

Lee and I are renting a 3/2 on wonderful Shoal Creek Blvd (as previously described).  Lee wanted the master and I had no preference, so I plopped my huck sack in between his room and the third.  18 months later, Lauren officially joined the ranks at The Purple Palace – our defiantly non-homosexual purple house – but wanted a larger closet.  Since I have all of 20 T-shirts and 10 pairs of shoes (6 pairs climbing), I obliged and moved into room 3, formerly occupied by useless junk, a box of stale cat shit, and other useless things your grandmother likely keeps for the next depression.

I hadn't even finished putting all of said crap out on the curb, when lo-and-behold, your grandmother (likely) showed up and hauled all of the 'treasure' away.  
Broken stool? "Check."   
Air conditioner with no knobs? "Check!"
Rickety table with typewriter-return sway? "Now I can use a keyboard!"
Sadly, the cat shit remains as a welcome greeting just outside our garage door.  Your grandma obviously isn't a cat lady.  Gold mine.

...

With the room empty, I began to stage the outlets.  What does this mean?  As an engineer, I have a higher-than-normal propensity to occupy wall sockets at near-peak levels.  Thus, I need to strategically plan out where the tables go.  Otherwise, I'm likely to place a table nowhere near a plug... and what's a table without an outlet?  Exactly.

Usually, I like to place the tables such that I can route all the plugs to the table with the minimum number of extension cords.  The far side of the room appeared devoid of any plugs, so I staged the tables in the opposite corner (three workspaces total).  There was one wall plate that had no plugs... nothing but a blank face.  Curious, I opened the box and found a standard 3-wire Romex cable that you'd find in any house built during the last 40 years or so.  The odd thing is that this house was built in 1953 – back before trivial things like safety existed – so I knew right away it was not a standard plug.  After pulling the leads out, one additional clue was the white neutral lead wrapped with a layer of black tape, indicating two hot leads.  Conclusion: my new room has a single phase, three wire 240V plug in it...

...wait for it...

...!!!!!!!!

What does that mean?  For starters, if I were European, I could use all my electronics without having to buy a transformer to bring the ass-backwards 120V US grid up to 240V.  Also, assuming I had a 3300 Watt, 240V input, 0-300VDC output baller power supply, I could plug it in my room instead of out in the laundry room in place of the dryer.  What's that?  I do have a 3300 Watt variable output 240V power supply?  Oh, well I'll just install that here:
Also, since the entire house lacks a ground (except for one outlet in the kitchen and another in Lee's room), I was able to connect the neutral lead (nominally unused on a 240V three-wire single phase setup) to the ground of one outlet in my room.  Now I don't need to daisy-chain an extension cord through the hallway to Lauren's room, which is great because I'd be plugging that cord into a surge protector that is powered by another extension cord that sources from that one outlet I previously mentioned in Lee's room.  One day our house will burn down.

So what on earth do I use this power supply for?  
To start, I actually have two of them:
They are pretty.  Combined, at full output they consume more power than a clothes dryer, a normal-sized air conditioner, or your puny electric car charger.  Since they are completely isolated and both voltage and current variable, you can hook them in parallel, series, etc, and derive pretty much any DC voltage you'd ever need at any current you could practically consume.  

As previously mentioned, I use the upper supply to charge the electric bike (the original focus of electrosanity).  I purchased the bottom supply to balance NiMH batteries, such as the ones in the Honda Insight, Civic, Toyota Prius, and pretty much every other non-exotic electric car on the market.  While I only need a 60V supply on the electric bike, I need a 320V supply on these mega-packs since they place all cells in series.  Here we are in action, charging customer #1's Insight pack:
I thought connecting wires to spherical magnets was an excellent method to quickly attach/detach voltage/temperature diagnostic leads to the bolts holding the battery bus bars together, but it turns out that heat demagnetizes neodymium.  I experimentally determined that it is possible to flash heat the magnet and then attach a lead, but the magnetism is variably compromised and the quick solder job is crap.  

Also, it turns out the contact resistance through the magnets is highly variable, likely due to minimal surface contact, poor surface malleability, and attracted oxide debris.  Here you can see a terrible 143.4 Ohm path resistance through two magnets: 
After experimenting some more, one solution is to place a magnet behind a nickel-plated copper conductor and have the magnetic field pull that known-good conductor against the bolt.

More customers to come (and details), once Lee and I find more time.

...

I finally built an enclosure for a pair of really nice class A amplifiers I picked up from a really crazy electrical engineer during yesteryear's internship:
At idle, they hover at ~140W, equating to 1% efficiency at casual listening levels.  Why so inefficient?  Class A amps don't have any crossover distortion.  The tradeoff is you have to keep the transistors burning full time... I still need to build a power supply that can handle this constant load, but for now I'll keep using the aforementioned baller power supplies.

Note the speakers in the above picture are an $80 set of 15 year old Sony bookshelf speakers... for the money, I've never heard anything better; they are a cherished personal reference.

On the opposite end of the spectrum, the Decentralized Dance Party came to Austin, necessitating a giant gold boombox:
Somewhere in the haze:

...

My commute home involves an 8' fence after 6:00 PM, if I want to avoid a shoulderless death trap:
Last week Qimikom took an 8' fall after dislodging his pedal from the fence.  Just a few scuffs on the saddle.

Groceries:


   

Instructed Bliss

Every few months, my chiropractor takes some measurements and then presents a body state-of-health analysis.  One key specification is "nominal head weight": how far my geeky head juts out.  Computers, bicycles, and standing awkwardly over my peers in some distant past effectively make my head weigh 40 pounds; I like to think it's all the information I try to pack into it, but thoughts have no weight in that sense (only when actioned).

On our way out the door last night, I glanced up at a geeky Spanish Oak perilously jutting his brawn over The Purple Palace and mused "I sure hope you don't fall over before our lease is up."

::night filled with wonderful discussion::
Scene: Next morning, lying in bed with Lauren:
Suddenly (on the time-scale of a tree), El Arbol's five-decade fight with gravity ends in a deafening crack, and he crashes to the ground in plain view as I watch in astonishment.  Moments after the crash:


Looking out for our well-being, El Arbol heroically unwound his tonnage counter-clockwise away from the roof, steered his perilously-increasing, newfound momentum over his young peers, picked a clear section of cold, hard earth and set course for crash, with a seismic thud on absolutely nothing, missing ceramic planters, cars, other trees, windows, the house, and everything else of any perceived value.  Phew!

...

I'm sure our landlord had better things to do than hire a tree service; such is the joy of renting.  In the meantime, I decided to cut a pathway through our new-found privacy curtain:
As an engineer, I had two main questions: 

Was El Arbol still under warranty?
No

Why did El Arbol fail? 
Root cause:  While the fallen upper plumage was in good health, after scaling the Spaniard's still-standing trunk, we observed a disease affecting the remaining lower branches.  Said disease migrated into and subsequently weakened the trunk, causing a focused failure just above a particularly infected branch:
The remaining branches on the trunk are terminally affected and the entire tree is likely to die.  El Arbol, you will be missed.  This quarter's grand total: -1 cat, -2 trees, +1 roommate.  Howie is doing a wonderful job holding down the fort:

...

With a sample size of 1, I've determined that lemons float in water, while limes sink:
It's unfortunate that it's not the other way around; I much prefer lemon juice.  

While having a particularly shitty day, I also determined that 4 eggs are very unlikely to do the following (entirely accidental):


Tuesday, February 28, 2012

Where have I been?

Texas gerrymandering at its finest:  On my short 7 mile bike ride to work, I cross through three congressional districts: 
My house is in district 10... representing the republican stronghold of Houston, and whipping its tentacles all the way to my liberal town, covering a small swath of republican outposts in between.

At mile 2, I cross into district 25... representing the Santorum-hailing town of Fort Worth, stretching all the way down through George Bush's far-right backcountry.

When I cross the property line at NI, I'm suddenly in the small Austin peninsula that is district 17, which spreads both north and east in excess of 100 miles, covering cattle ranches and rural east central Texas.

If, after work I want to go climb, the 4 mile bike ride somehow wanders into district 31, which heads north on I-35 and covers Perry's greatest fans.

How is this legal?  Two supreme court rulings have not fixed the root moral issue.

http://gis1.tlc.state.tx.us/
Select map "PLANC235 - COURT-ORDERED INTERIM CONGRESSIONAL PLAN"


More posts soon!  I've taken a hiatus due to some serious time commitments.  I've been working on the final BMS schematic I've been dreaming up for over a year now... info to follow.