Given a density $f$, we consider the problem of estimating the density functional $\psi_r=\int f^{(r)}f$ for a non-negative even integer $r$ using kernel methods. This is a well-known problem, but some of its features have remained unexplored. We focus on the problem of bandwidth selection. Whereas all previous studies concentrate on an asymptotically optimal bandwidth, here we study the properties of exact, non-asymptotic bandwidths and relate them to the asymptotic one. Our main conclusion is that, although the two are asymptotically equivalent, much is lost at realistic sample sizes by using the asymptotically optimal bandwidth. Instead, as a target for data-driven selectors, we propose another bandwidth that retains the small-sample performance of the exact one.
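
To make the estimation problem concrete, a kernel estimator of $\psi_r$ commonly used in this setting is $\hat\psi_r(g)=n^{-2}\sum_{i=1}^n\sum_{j=1}^n L_g^{(r)}(X_i-X_j)$, where $L_g(x)=L(x/g)/g$ and $g>0$ is the bandwidth. The following is a minimal Python sketch under stated assumptions: a Gaussian kernel $L$ and the version that keeps the diagonal terms $i=j$. The text above fixes neither choice, and the function names are hypothetical.

```python
import math


def hermite_prob(r, x):
    """Probabilists' Hermite polynomial He_r(x), via He_{k+1} = x*He_k - k*He_{k-1}."""
    h_prev, h = 1.0, x
    if r == 0:
        return h_prev
    for k in range(1, r):
        h_prev, h = h, x * h - k * h_prev
    return h


def gaussian_kernel_derivative(r, x):
    """r-th derivative of the standard normal density: (-1)^r He_r(x) phi(x)."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return (-1) ** r * hermite_prob(r, x) * phi


def psi_hat(data, r, g):
    """Kernel estimate of psi_r with bandwidth g (assumed Gaussian kernel,
    diagonal terms i == j included)."""
    n = len(data)
    total = 0.0
    for xi in data:
        for xj in data:
            total += gaussian_kernel_derivative(r, (xi - xj) / g)
    # L_g^{(r)}(x) = g^{-(r+1)} L^{(r)}(x / g), hence the g^(r+1) factor.
    return total / (n * n * g ** (r + 1))
```

The quality of `psi_hat` depends strongly on `g`, which is why the choice between the exact and the asymptotically optimal bandwidth matters. The double loop costs $O(n^2)$ kernel evaluations, which is acceptable for a sketch but slow for large samples.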