Group:  Microsoft Excel » microsoft.public.excel.sdk
Thread: unique function instance identification

unique function instance identification
Robert Schiele <schiele[ at ]newsgroup.nospam> 12/2/2008 10:10:08 AM
Hi there,

I have a problem I didn't find an API function for so far. In the hope that
someone out there has a smart idea on how to achieve the goal I will ask the
question in this group.

I am working on an application that has a user defined function implemented
(let's call it "foo") in a XLL. The function calculates a numerical result
but by design does not get it's input data from regular Excel parameters but
from an external data source. The external data source is aware of the
correct result as soon as it can identify the location of the function in
the Excel sheet.

This can easily be solved as long as no more than one instance of this
function appears in each cell because I can just query the row and column of
the cell I am called from.

A problem comes up as soon as a user enters the function in a cell formula
at two places like "=foo()+3*foo()". So far I have not found a solution on
how to determine within the evaluation function whether I am currently
evaluating the first instance of "foo" or the second one.

Does anyone here have an idea whether there is a way to detect this?

I mean a workaround would be to require the user to submit a unique
parameter like "=foo(1)+3*foo(2)" but this does not look like a very user
friendly solution and thus I would prefer if someone came up with a more
elegant solution.

Robert
Re: unique function instance identification
lab27 <lee.benfield[ at ]gmail.com> 12/3/2008 6:00:47 PM
see inline

On Dec 2, 10:10 am, Robert Schiele <schi...[ at ]newsgroup.nospam> wrote:
[Quoted Text]
> Hi there,
>
> I have a problem I didn't find an API function for so far. In the hope that
> someone out there has a smart idea on how to achieve the goal I will ask the
> question in this group.
>
> I am working on an application that has a user defined function implemented
> (let's call it "foo") in a XLL. The function calculates a numerical result
> but by design does not get it's input data from regular Excel parameters but
> from an external data source. The external data source is aware of the
> correct result as soon as it can identify the location of the function in
> the Excel sheet.
>

I assume from this that you're using xlfCaller to get a reference to
the calling cell? (to determine the location of the caller)

> This can easily be solved as long as no more than one instance of this
> function appears in each cell because I can just query the row and column of
> the cell I am called from.
>
> A problem comes up as soon as a user enters the function in a cell formula
> at two places like "=foo()+3*foo()". So far I have not found a solution on
> how to determine within the evaluation function whether I am currently
> evaluating the first instance of "foo" or the second one.

Can you clarify this? If your function (as you say above) ' is aware
of the correct result as soon as it can identify the location of the
function' then why does it matter if the function is called twice?
Does it return different results? Is the function side effect free?
(i.e. why does it matter that you've called it twice? - what's the
relationship of the calling cell to the result?)

>
> Does anyone here have an idea whether there is a way to detect this?
>
> I mean a workaround would be to require the user to submit a unique
> parameter like "=foo(1)+3*foo(2)" but this does not look like a very user
> friendly solution and thus I would prefer if someone came up with a more
> elegant solution.
>

If you're using the caller address so that you can return more than
one number from a large result set (for example), consider returning
an array (xlTypeMulti), and dumping the results into multiple cells at
once using an array formula.

If you REALLY mean that the result is a function of the caller, it
SHOULD be valid (if not efficient) just to compute multiple times.
(ok, you don't want to do this, I'm just trying to find out if the
caller address really is important!) - if this is the case, then you
can cache the return value based on cell address (xlfCaller again).
Since you just want to avoid recomputation of multiple values in the
same cell, if you're not using multi threaded (2k7) api, it's
sufficient to check (again, this is only true if your function is as I
infer above, cachable) that the caller is the same as the previous
caller - i.e. maintain the last caller and the last result, and don't
recompute if caller is unchanged.

If the above doesn't hold true, you might need to provide a few more
details ;)

> Robert

Cheers,

Lee.
Re: unique function instance identification
Robert Schiele <schiele[ at ]newsgroup.nospam> 12/3/2008 10:29:11 PM
lab27 wrote:

[Quoted Text]
> see inline

Thanks for investing time into this obscure sounding topic!

> I assume from this that you're using xlfCaller to get a reference to
> the calling cell? (to determine the location of the caller)

This is exactly how I planned to get this part of the information.

>> A problem comes up as soon as a user enters the function in a cell
>> formula at two places like "=foo()+3*foo()". So far I have not found a
>> solution on how to determine within the evaluation function whether I am
>> currently evaluating the first instance of "foo" or the second one.
>
> Can you clarify this? If your function (as you say above) ' is aware
> of the correct result as soon as it can identify the location of the
> function' then why does it matter if the function is called twice?

Because it could return different results for different instances of the
function.

> Does it return different results? Is the function side effect free?

It does always return the same result for the same instance but could return
a different result for a different instance where calling this function
twice within one formula is considered two instances.

The function does not have a side effect. It basically is just a placeholder
to tell Excel to get the actual data from an external data source that uses
some fancy algorithms to calculate the correct value for this specific
instance.

If there is only one instance of this function call within one cell it is no
issue to understand the structure of the formula there because as you said
above I can determine the calling cell. After that I can read the formula
content of this cell and analyze the structure of the formula and the
position of the call within that formula. With all that information the
external data source knows what belongs into that place.

For example consider the cell A1 contains "=3*foo()". When foo is called I
can determine that the calling cell is A1, read the formula string. With
that information the external application knows that the right number for
this place is 5. Thus we just return 5.

As soon as there are two instances of this function call within a cell I am
stuck after reading the formula content because I have no clue how to
determine whether I am about to fill in data for the first instance or the
second one.

For example consider the cell B1 contains "=foo()+3*foo()". When foo is
called the first time to evaluate the first instance of foo I can again
determine the calling cell B1 and read the formula string. Further the
application knows that the first foo() in the formula should return 17 and
the second one 42. But now I have a problem since I have no information
whether I am about to evaluate the first or the second instance within this
cell and thus have no clue whether I need to return 17 or 42.

> (i.e. why does it matter that you've called it twice? - what's the
> relationship of the calling cell to the result?)

I cannot describe the exact algorithm to determine the result here since
this includes confidential information. Thus I try to provide as much
technical information as needed without showing information on a more
abstract level that occurs outside the Excel application itself.

Just consider foo to be a function that calculates a hash value out of the
cell name and the position of the call within the formula of the cell. Thus
in the above example of cell B1 the first value would be a hash value of
"B1" and the number 1 (since the call is the first operand in the formula)
and the second value would be the hash value of "B1" and the number 3 (since
the call is the third operand in the formula).

> If you're using the caller address so that you can return more than
> one number from a large result set (for example), consider returning
> an array (xlTypeMulti), and dumping the results into multiple cells at
> once using an array formula.

Unfortunately one of the design requirements was that these placeholders
could be placed anywhere into a formula. Obviously if it finally turns out
that there is no way in Excel to determine the exact position of the call
within a cell formula we need to give up some of these requirements.

> If you REALLY mean that the result is a function of the caller, it
> SHOULD be valid (if not efficient) just to compute multiple times.
> (ok, you don't want to do this, I'm just trying to find out if the
> caller address really is important!) - if this is the case, then you
> can cache the return value based on cell address (xlfCaller again).
> Since you just want to avoid recomputation of multiple values in the
> same cell, if you're not using multi threaded (2k7) api, it's
> sufficient to check (again, this is only true if your function is as I
> infer above, cachable) that the caller is the same as the previous
> caller - i.e. maintain the last caller and the last result, and don't
> recompute if caller is unchanged.

Multiple recalculation is no issue. The issue is really just to get the
"right" answer at the "right" place.

> If the above doesn't hold true, you might need to provide a few more
> details ;)

I hope the above description and examples was more useful than confusing.
;-)

Robert
Re: unique function instance identification
lab27 <lee.benfield[ at ]gmail.com> 12/4/2008 11:03:02 AM

you /could/ do this (with caveats), but I'd really not reccomend it :P
(more inline)

On Dec 3, 10:29 pm, Robert Schiele <schi...[ at ]newsgroup.nospam> wrote:
[Quoted Text]
> lab27 wrote:
> > see inline
>
> Thanks for investing time into this obscure sounding topic!
>
> > I assume from this that you're using xlfCaller to get a reference to
> > the calling cell? (to determine the location of the caller)
>
> This is exactly how I planned to get this part of the information.
>
> >> A problem comes up as soon as a user enters the function in a cell
> >> formula at two places like "=foo()+3*foo()". So far I have not found a
> >> solution on how to determine within the evaluation function whether I am
> >> currently evaluating the first instance of "foo" or the second one.
>
> > Can you clarify this? If your function (as you say above) ' is aware
> > of the correct result as soon as it can identify the location of the
> > function' then why does it matter if the function is called twice?
>
> Because it could return different results for different instances of the
> function.
>
> Just consider foo to be a function that calculates a hash value out of the
> cell name and the position of the call within the formula of the cell. Thus
> in the above example of cell B1 the first value would be a hash value of
> "B1" and the number 1 (since the call is the first operand in the formula)
> and the second value would be the hash value of "B1" and the number 3 (since
> the call is the third operand in the formula).
>
> Unfortunately one of the design requirements was that these placeholders
> could be placed anywhere into a formula. Obviously if it finally turns out
> that there is no way in Excel to determine the exact position of the call
> within a cell formula we need to give up some of these requirements.
>

What you /can't/ do is find the TEXTUAL position of a given call to
foo in a formula.
But - you can find the TEMPORAL position, i.e. the Nth time foo has
been called IN THIS CELL. Excel calculates subexpressions left to
right [*].... (with the exception of 'if', for which it only
calculates the required branch[+] - see 1,2 below)

The problem is that they're not quite the same thing (almost!), and
even if they were, I'd feel REALLY uncomfortable relying on that fact.

You can do pretty much what I said previously - (again, assuming no
multi threaded XLL calls)

have a static variable "call counter" in your fn

* get caller address
* every time you see a call from a new cell, reset the call counter to
0.
* If it's a call from the same cell as the last call - increment the
call counter.
Therefore you know which cell is being called (xlfCaller), and which
'instance' it is (call counter).

Failings of this approach (other than that it's not quite what you
want ;) :

* You need at least 2 cells to calculate every time you calc the
sheet, otherwise you can never tell you've 'left' a cell, and reset
the counter.
* Unless you detect it (which you can, google for it), calls to foo()
from the function wizard will mess up this relationship.
* Won't work if you specify foo() can be called from multiple threads
(xl2k7 api option)

BUT (and it's a big BUT)
- I'm quite uncomfortable with your whole spec for a few reasons
tho....


1)

=if(foo()>10, foo(), 10) (foo() != foo())

Since foo() isn't repeatable (in a cell), you'll have some confused
users at some point (ok, that really is in your spec. but eww!)

2)

=if(false, foo(), foo()) (the second 'foo' is called, but the first
foo never is, so the lexically second 'foo' is temporally first -
excel short circuits 'if'[+]).

3)

=or(foo()>0, foo()>0, foo()>0) (foo is called three times, excel
doesn't short circuit booleans :P - will your users get confused?)

(etc etc)

4)

Making the calling cell an implicit parameter is not really a good
thing to do - it's fine if you're using it for something which doesn't
affect the calculation, (pretty much anything out of band, logging
etc) - but it's DANGEROUS if it does - you just know that someone who
doesn't understand the inner workings of your addin will insert a
column in your sheet at some point in the future, and cause all your
results to change. Or merge two cells into one. Etc....

And you haven't even got one implicit parameter, you've got two, one
of which is not quite the variable you want.

So - you /can/ do it, using the approach at the top, but I would run a
mile at this point!

Hope that helps, even if it's just to give you a chance to get a
better spec ;)

Lee.

------
[*] and there's no reason it should have to, but I doubt it'll
change. it even calculates stuff it doesn't need to - eg in ' =na()
*foo() ', foo() will be called, though excel could legitimately have
skipped it - I suspect the developers don't dare make these kind of
optimisations because people are doing nasty side effecting things,
which they shouldn't.
[+]Unless you've got R type arguments, and excel's doing speculative
execution in order to determine a calculation tree. But that's not
the case here.
Re: unique function instance identification
Chris <chris[ at ]chrisbird.com> 12/4/2008 4:30:53 PM
I would agree with the other comments - insomuch as a solution is
possible, but is a complete hack and would probably cause confusion
and problems later down the line.
Generally is you have a UDF foo() it should return the same value.
What happens if at some point the user want to display multiple
'lookups' on the same sheet. Making foo() dependent upon the cell
stops this happening.
One method which I have seen posted elsewhere (I believe there is an
example on codeproject.com) is for a particular formula to modify the
formula in the existing cell. For example:

in cell R1C1:
foo2()

would modify the R1C1 cell to be:
foo(R1C1)

Strictly speaking this is also bad practice - as modifying the cells
within a UDF is not allowed - but the example on codeproject does
this.

Alternatively, if your foo() function is perfoming some kind of lookup
- could you not return an array on values as the result of foo() and
then reference into the array?

Again, I would question the specification you have been given and
strongly recommend that you take an alternative approach.

Chris.

On Dec 3, 10:29 pm, Robert Schiele <schi...[ at ]newsgroup.nospam> wrote:
[Quoted Text]
> lab27 wrote:
> > see inline
>
> Thanks for investing time into this obscure sounding topic!
>
> > I assume from this that you're using xlfCaller to get a reference to
> > the calling cell? (to determine the location of the caller)
>
> This is exactly how I planned to get this part of the information.
>
> >> A problem comes up as soon as a user enters the function in a cell
> >> formula at two places like "=foo()+3*foo()". So far I have not found a
> >> solution on how to determine within the evaluation function whether I am
> >> currently evaluating the first instance of "foo" or the second one.
>
> > Can you clarify this?  If your function (as you say above) ' is aware
> > of the correct result as soon as it can identify the location of the
> > function' then why does it matter if the function is called twice?
>
> Because it could return different results for different instances of the
> function.
>
> > Does it return different results?  Is the function side effect free?
>
> It does always return the same result for the same instance but could return
> a different result for a different instance where calling this function
> twice within one formula is considered two instances.
>
> The function does not have a side effect. It basically is just a placeholder
> to tell Excel to get the actual data from an external data source that uses
> some fancy algorithms to calculate the correct value for this specific
> instance.
>
> If there is only one instance of this function call within one cell it is no
> issue to understand the structure of the formula there because as you said
> above I can determine the calling cell. After that I can read the formula
> content of this cell and analyze the structure of the formula and the
> position of the call within that formula. With all that information the
> external data source knows what belongs into that place.
>
> For example consider the cell A1 contains "=3*foo()". When foo is called I
> can determine that the calling cell is A1, read the formula string. With
> that information the external application knows that the right number for
> this place is 5. Thus we just return 5.
>
> As soon as there are two instances of this function call within a cell I am
> stuck after reading the formula content because I have no clue how to
> determine whether I am about to fill in data for the first instance or the
> second one.
>
> For example consider the cell B1 contains "=foo()+3*foo()". When foo is
> called the first time to evaluate the first instance of foo I can again
> determine the calling cell B1 and read the formula string. Further the
> application knows that the first foo() in the formula should return 17 and
> the second one 42. But now I have a problem since I have no information
> whether I am about to evaluate the first or the second instance within this
> cell and thus have no clue whether I need to return 17 or 42.
>
> > (i.e. why does it matter that you've called it twice? -  what's the
> > relationship of the calling cell to the result?)
>
> I cannot describe the exact algorithm to determine the result here since
> this includes confidential information. Thus I try to provide as much
> technical information as needed without showing information on a more
> abstract level that occurs outside the Excel application itself.
>
> Just consider foo to be a function that calculates a hash value out of the
> cell name and the position of the call within the formula of the cell. Thus
> in the above example of cell B1 the first value would be a hash value of
> "B1" and the number 1 (since the call is the first operand in the formula)
> and the second value would be the hash value of "B1" and the number 3 (since
> the call is the third operand in the formula).
>
> > If you're using the caller address so that you can return more than
> > one number from a large result set (for example), consider returning
> > an array (xlTypeMulti), and dumping the results into multiple cells at
> > once using an array formula.
>
> Unfortunately one of the design requirements was that these placeholders
> could be placed anywhere into a formula. Obviously if it finally turns out
> that there is no way in Excel to determine the exact position of the call
> within a cell formula we need to give up some of these requirements.
>
> > If you REALLY mean that the result is a function of the caller, it
> > SHOULD be valid (if not efficient) just to compute multiple times.
> > (ok, you don't want to do this, I'm just trying to find out if the
> > caller address really is important!) - if this is the case, then you
> > can cache the return value based on cell address (xlfCaller again).
> > Since you just want to avoid recomputation of multiple values in the
> > same cell, if you're not using multi threaded (2k7) api, it's
> > sufficient to check (again, this is only true if your function is as I
> > infer above, cachable) that the caller is the same as the previous
> > caller - i.e. maintain the last caller and the last result, and don't
> > recompute if caller is unchanged.
>
> Multiple recalculation is no issue. The issue is really just to get the
> "right" answer at the "right" place.
>
> > If the above doesn't hold true, you might need to provide a few more
> > details ;)
>
> I hope the above description and examples was more useful than confusing.
> ;-)
>
> Robert

Re: unique function instance identification
"Steve Dalton" <NOsteveSPAM(at)NOeigensysSPAM(dot)com> 12/7/2008 1:10:16 PM
Hi Robert

I wholeheartedly agree with the "question the spec and try another path"
advice given by lab27 and Chris. This is very messy and strikes me as being
equivalent to walk from the east coast of the US to the west coast by
heading east and then asking if anyone know a way of walking on water.

Another objection I have to the idea of your function is that it breaks the
basic assumption that =Foo2(Foo()) is equivalent to =Foo2(A1) where A1
contains =Foo(). As has already been said, you might cover that in your
docs but it's not very nice.

On the issue raised by lab27, it's my understanding that =IF(...) is not
selective in which of its arguments it evaluates, something I have confirmed
with a member of the MS development team about a year or two ago. Perhaps I
was misinformed, but it would hugely complicate each recalculation (and all
the dependancies) were this not the case. In my book, therefore, I recomend
that arguments to the IF function are kept as simple as possible with as
little repetition as possible. If anyone knows for sure this is not the
case I would be happy to be corrected.


Regards

Steve




"Robert Schiele" <schiele[ at ]newsgroup.nospam> wrote in message
news:u44WPaZVJHA.4148[ at ]TK2MSFTNGP05.phx.gbl...
[Quoted Text]
> lab27 wrote:
>
>> see inline
>
> Thanks for investing time into this obscure sounding topic!
>
>> I assume from this that you're using xlfCaller to get a reference to
>> the calling cell? (to determine the location of the caller)
>
> This is exactly how I planned to get this part of the information.
>
>>> A problem comes up as soon as a user enters the function in a cell
>>> formula at two places like "=foo()+3*foo()". So far I have not found a
>>> solution on how to determine within the evaluation function whether I am
>>> currently evaluating the first instance of "foo" or the second one.
>>
>> Can you clarify this? If your function (as you say above) ' is aware
>> of the correct result as soon as it can identify the location of the
>> function' then why does it matter if the function is called twice?
>
> Because it could return different results for different instances of the
> function.
>
>> Does it return different results? Is the function side effect free?
>
> It does always return the same result for the same instance but could
> return
> a different result for a different instance where calling this function
> twice within one formula is considered two instances.
>
> The function does not have a side effect. It basically is just a
> placeholder
> to tell Excel to get the actual data from an external data source that
> uses
> some fancy algorithms to calculate the correct value for this specific
> instance.
>
> If there is only one instance of this function call within one cell it is
> no
> issue to understand the structure of the formula there because as you said
> above I can determine the calling cell. After that I can read the formula
> content of this cell and analyze the structure of the formula and the
> position of the call within that formula. With all that information the
> external data source knows what belongs into that place.
>
> For example consider the cell A1 contains "=3*foo()". When foo is called I
> can determine that the calling cell is A1, read the formula string. With
> that information the external application knows that the right number for
> this place is 5. Thus we just return 5.
>
> As soon as there are two instances of this function call within a cell I
> am
> stuck after reading the formula content because I have no clue how to
> determine whether I am about to fill in data for the first instance or the
> second one.
>
> For example consider the cell B1 contains "=foo()+3*foo()". When foo is
> called the first time to evaluate the first instance of foo I can again
> determine the calling cell B1 and read the formula string. Further the
> application knows that the first foo() in the formula should return 17 and
> the second one 42. But now I have a problem since I have no information
> whether I am about to evaluate the first or the second instance within
> this
> cell and thus have no clue whether I need to return 17 or 42.
>
>> (i.e. why does it matter that you've called it twice? - what's the
>> relationship of the calling cell to the result?)
>
> I cannot describe the exact algorithm to determine the result here since
> this includes confidential information. Thus I try to provide as much
> technical information as needed without showing information on a more
> abstract level that occurs outside the Excel application itself.
>
> Just consider foo to be a function that calculates a hash value out of the
> cell name and the position of the call within the formula of the cell.
> Thus
> in the above example of cell B1 the first value would be a hash value of
> "B1" and the number 1 (since the call is the first operand in the formula)
> and the second value would be the hash value of "B1" and the number 3
> (since
> the call is the third operand in the formula).
>
>> If you're using the caller address so that you can return more than
>> one number from a large result set (for example), consider returning
>> an array (xlTypeMulti), and dumping the results into multiple cells at
>> once using an array formula.
>
> Unfortunately one of the design requirements was that these placeholders
> could be placed anywhere into a formula. Obviously if it finally turns out
> that there is no way in Excel to determine the exact position of the call
> within a cell formula we need to give up some of these requirements.
>
>> If you REALLY mean that the result is a function of the caller, it
>> SHOULD be valid (if not efficient) just to compute multiple times.
>> (ok, you don't want to do this, I'm just trying to find out if the
>> caller address really is important!) - if this is the case, then you
>> can cache the return value based on cell address (xlfCaller again).
>> Since you just want to avoid recomputation of multiple values in the
>> same cell, if you're not using multi threaded (2k7) api, it's
>> sufficient to check (again, this is only true if your function is as I
>> infer above, cachable) that the caller is the same as the previous
>> caller - i.e. maintain the last caller and the last result, and don't
>> recompute if caller is unchanged.
>
> Multiple recalculation is no issue. The issue is really just to get the
> "right" answer at the "right" place.
>
>> If the above doesn't hold true, you might need to provide a few more
>> details ;)
>
> I hope the above description and examples was more useful than confusing.
> ;-)
>
> Robert


RE: unique function instance identification
Fredrik Wahlgren 12/7/2008 11:19:00 PM

I have had exactly the same pronlem. My solution, if I remember correctly,
was to get the function string from within the xll. Then it can be parsed. In
your case, you need to search for two occurernces pf the string "foo". This
goes back a few years, I can't remeber the name of the function I used. I do
remeber that I wanted to know if my function was passed a string or a cell
reference.

/ Fredrik Wahlgren

"Robert Schiele" wrote:

[Quoted Text]
> Hi there,
>
> I have a problem I didn't find an API function for so far. In the hope that
> someone out there has a smart idea on how to achieve the goal I will ask the
> question in this group.
>
> I am working on an application that has a user defined function implemented
> (let's call it "foo") in a XLL. The function calculates a numerical result
> but by design does not get it's input data from regular Excel parameters but
> from an external data source. The external data source is aware of the
> correct result as soon as it can identify the location of the function in
> the Excel sheet.
>
> This can easily be solved as long as no more than one instance of this
> function appears in each cell because I can just query the row and column of
> the cell I am called from.
>
> A problem comes up as soon as a user enters the function in a cell formula
> at two places like "=foo()+3*foo()". So far I have not found a solution on
> how to determine within the evaluation function whether I am currently
> evaluating the first instance of "foo" or the second one.
>
> Does anyone here have an idea whether there is a way to detect this?
>
> I mean a workaround would be to require the user to submit a unique
> parameter like "=foo(1)+3*foo(2)" but this does not look like a very user
> friendly solution and thus I would prefer if someone came up with a more
> elegant solution.
>
> Robert
>

Home | Search | Terms | Imprint
Newsgroups Reader